HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
MASTER’S GRADUATION THESIS
Optimal deployment of intelligent mobile
air quality systems
NGUYEN VIET DUNG
Dung.NV202342M@sis.hust.edu.vn
Major: Data Science and Artificial Intelligence (Elitech)
Thesis advisor:
Assoc.Prof. Do Phan Thuan _________________
Institute:
School of Information and Communication
Technology
HA NOI, 09/2022
SĐH.QT9.BM11 Ban hành lần 1 ngày 11/11/2014
CỘNG HÒA XÃ HỘI CHỦ NGHĨA VIỆT NAM
Độc lập Tự do Hạnh phúc
BẢN XÁC NHẬN CHỈNH SỬA LUẬN VĂN THẠC
Họ và tên tác giả luận văn: Nguyễn Việt Dũng
Đề tài luận văn:
Triển khai tối ưu các hệ thống quan trắc không kdi động
thông minh
Chuyên ngành: Khoa học dữ liệu và Trí tunhân tạo
Mã số SV: 20202342M
Tác giả, Người ớng dẫn khoa học Hội đồng chấm luận văn
xác nhận tác giả đã sửa chữa, bổ sung luận văn theo biên bản họp Hội đồng
ngày 29/10/2022 với các nội dung sau:
- Thêm giới thiệu chi tiết hơn v các nghiên cứu có liên quan trong
chương 2.
- Đổi tên chương 3 t Problem formulation & hardness thành
Problem formulation”.
- Thêm phát biểu vbài toán opportunistic sensing optimization trước
khi viết tắt thành OSO.
- Đổi tên phần 3.2 thành “Mathematical formulation of OSO”.
- Thêm giải thích rõ hơn vhàm mục tiêu và các điều kiện trong mục
3.2.
- Thêm lý do giải thích vì sao sử dụng thuật toán quy hoạch động:
“In this simplified scenario, our dynamic programming approach
guarantees that the set found by the submaxSet function is always
maximum. thus the number mentioned in the previous section
5.1.1.2 will be equal to 1. Later we will show that we cannot use
dynamic programming in the general scenario, and we will need
another greedy sub-process which has a lower performance ratio for
that.
- Thêm một sgiải thích chi tiết vcác thuật toán meta-heuristics và lý
do lựa chọn sử dụng chúng, cụ thể như sau:
+ “They are appropriate methods to verify efficiency of the
approximation algorithm, since their tremendous performance in
practice was shown in numerous research papers, especially
researches related to air monitoring systems. If the greedy
approximation approach is decent, the experimental results produced
SĐH.QT9.BM11 Ban hành lần 1 ngày 11/11/2014
by it should be competitive to the ones produced by the chosen meta-
heuristics. It is indeed true, and we will show the experimental results
supporting this observation later in this thesis.”
+ “Two meta-heuristics, the genetic algorithm and the simulated
annealing algorithm, are chosen to solve the OSO problem because of
their simplicity and efficiency in practice. Related researches about air
monitoring systems also deployed these methods to solve challenging
problems, and the results usually show that they are good choices for
creating a solution.”
- Thêm giải thích cho các hình vẽ và bng biểu.
- Thêm mô tả input và output cho các thut toán.
- Thêm mục 6.4. “Comparison of results between the approximation
algorithm and the meta-heuristics” và chuyển mục 6.4 thành mục
6.5. “Discussion”.
Ngày tháng năm
Giáo viên hướng dẫn Tác giả luận văn
CHỦ TỊCH HỘI ĐỒNG
3
Name: Nguyen Viet Dung
Phone: +84 399629097 Email : Dung.NV202342M@sis.hust.edu.vn
Student ID: 20202342M Class: 20BKHDL-E
Thesis title: Optimal deployment of intelligent mobile air quality systems
Thesis code: 2020BKHDL-KH01
Affiliation : Hanoi University of Science and Technology
I Nguyen Viet Dung - hereby warrants that the work and presentation in this thesis
performed by myself under the supervision of Assoc.Prof. Do Phan Thuan. All the results
presented in this thesis are truthful and are not copied from any other works. All
references in this thesis including images, tables, figures and, quotes are clearly
and fully documented in the bibliography. I will take full responsibility for even
one copy that violates school regulations.
Hanoi, 28th September, 2022
Author
Nguyen Viet Dung
Attestation of thesis advisor :
I certify that the thesis entitled “Optimal deployment of intelligent mobile air quality
systems” submitted for the degree of Master of Science (M.S.) by Mr. Nguyen Viet Dung is
the record of research work carried out by him during the period from 10/2020 to 10/2022
under my guidance and supervision, and that this work has not formed the basis for the award
of any Degree, Diploma, Associateship and Fellowship or other Titles in this University or
any other University or institution of Higher Learning.
Hanoi, 28th September, 2022
Thesis Advisor
Assoc.Prof. Do Phan Thuan
Graduation Thesis Assignment
4
In order to obtain this master's thesis, apart from my own efforts, it is impossible not to
mention the help of many other people.
First, I would like to thank Associate Professor Do Phan Thuan and Dr. Nguyen Phi Le, my
direct mentors. From the time I got my thesis title to the time I finished it, there was not a
moment that they didn't encourage me to run to the finish line. I am where I am today in
large part because of their support.
Next, I have to mention the funding source of VinIF. Their financial support helped me to
pay my tuition fees and complete my studies with peace of mind.
Finally, I would like to express my sincerest thanks to my teachers, friends and family.
Without them by my side, I wouldn't have made it to the end of the road.
Two years of wonderful lectures and extremely helpful time doing research will be in my
heart forever.
Acknowledgements
5
Monitoring air quality plays a critical role in the sustainable development of developing
regions where the air is severely polluted. Air quality monitoring systems based on static
monitors often do not provide information about the area each monitor represents or
represent only small areas. In addition, they have high deployment costs that reflect the
efforts needed to ensure sufficient quality of measurements. Meanwhile, the mobile air
quality monitoring system, such as the one in this work, shows the feasibility of solving
those challenges. The system includes environmental sensors mounted on buses that move
along their routes, broadening the monitoring areas. In such a system, we introduce a new
optimization problem named opportunistic sensing that aims to find (1) optimal buses to
place the sensors and (2) the optimal monitoring timing to maximize the number of
monitored critical regions.
We investigate the optimization problem in two scenarios: simplified and general bus routes.
Initially, we mathematically formulate the targeted problem and prove its NP-hardness.
Then, we propose a polynomial-time
-,


- approximation algorithm for the problem
with the simplified, general routes, respectively. To show the proposed algorithms’
effectiveness, we have evaluated it on the real data of real bus routes in Hanoi, Vietnam. The
evaluation results show that the former algorithm guarantees an average performance ratio
of 75.70%, while the latter algorithm achieves the ratio of 63.96%. Notably, when the
sensors can be on (e.g., enough energy) during the whole route, the


-approximation
algorithm achieves the approximation ratio of (1
). Such ratio, which is almost twice as


, enlarges the average performance ratio to 78.42%.
To further test the efficiency of the greedy approximation algorithm and optimize the results,
we propose two more meta-heuristic algorithms for this problem: genetic algorithm and
simulated annealing algorithm. Experiments show that the above meta-heuristic algorithms
only increase the goodness of the results by 1% to 3% on average, but have a much larger
running time than the greedy algorithm. From there, we see that the approximation algorithm
in particular is already a feasible solution in practice without mentioning any other
complicated tools.
Abstract
6
Graduation Thesis Assignment 3
Acknowledgements 4
Abstract 5
Content 6
List of Figures 8
List of Tables 9
Acronyms 10
Chapter 1. Introduction 11
1.1. Mobile air quality monitoring systems 11
1.2. Opportunistic sensing optimization (OSO) problem 12
1.3. Thesis contribution 12
1.4. Structure of thesis 12
Chapter 2. Related works 13
Chapter 3. Problem formulation 17
3.1. Problem statement 17
3.2. Mathematical formulation of OSO 18
3.3. Hardness of OSO 22
Chapter 4. Theoretical background 24
4.1. Approximation algorithms 24
4.2. Meta-heuristic algorithms 24
7
4.3. Research methodology 27
Chapter 5. Proposed solution 29
5.1. Approximation algorithms 29
5.2. Meta-heuristic algorithms 38
Chapter 6. Experimental results 42
6.1. Experimental settings 42
6.2. Numerical results of approximation algorithms 45
6.3. Numerical results of meta-heuristic algorithms 51
6.4. Comparison of results between the approximation algorithm and the meta-heuristics 61
6.5. Discussion 61
Chapter 7. Conclusion 63
Published papers 64
References 65
8
Figure 1. A map of size 4 × 4 with 3 bus routes and 6 critical squares. When = 2, an
example of the sensor’s turn-on positions on bus 1 is shown. With such selected positions,
that sensor can observe 5 critical squares , , , and . ................................................ 17
Figure 2. Illustration of observable boundary, observable square, and observable segment.
.............................................................................................................................................. 19
Figure 3. Illustration of Theorem 3.1’s proof ( is an arbitrary point on a bus route segment
. is the leftmost observable bound closest to . If is a critical square observable by ,
then it is also observable by ). ........................................................................................... 20
Figure 4. A corresponding bus map when = 3,
1
= {, , , , },
2
= {, , , },
and
3
= {, }. .................................................................................................................. 23
Figure 5. The remaining map after removing bus 1 from the map in Fig. 1, and the greedy
process continues.. ............................................................................................................... 29
Figure 6. (a) [l
Ab
,
Ab
] is the unique close segment that contains all sensor’s turn-on positions
on the bus route where the critical square is observed. (b) There are critical squares
observed by turning on sensor from bus route (in this figure, = 5). Each square can be
observed by a sensor turned on at somewhere in the middle of the interval [l
ib
,
ib
]. We then
have critical points which are the left endpoints (l
ib
, where = 1, … , ) of such intervals.
.............................................................................................................................................. 33
Figure 7. Efficiency heatmap.. ............................................................................................ 45
Figure 8. Performance in the simplified scenario with = 10, = 12. .............................. 46
Figure 9. Performance in the simplified scenario with = 25, = 30. .............................. 47
Figure 10. Performance in the simplified scenario with = 30, = 36. ............................ 47
Figure 11. Performance in the simplified scenario with = 42, = 50. ............................ 47
Figure 12. Performance in the general and special scenario with = 10, = 12. .............. 48
Figure 14. Performance in the general and special scenario with = 30, = 36. .............. 49
Figure 15. Performance in the general and special scenario with = 42, = 50. .............. 50
List of Figures
9
Table 1. Notation list………………………………………………………………...…… 18
Table 2. Meta-heuristics performance compared to the approximation algorithm’s results in
the simplified scenario…………………………………………………...……………….. 51
Table 3. Meta-heuristics performance compared to the approximation algorithm’s results in
the general scenario. ............................................................................................................ 55
List of Tables
10
Abbreviations and terms Meaning
OSO Opportunistic sensing optimization
GA Genetic algorithm
SA Simulated annealing
Fig. Figure
Acronyms
11
1.1. Mobile air quality monitoring systems
The fast industrialization and urbanization, especially in developing countries, cause air
pollution in urban areas. According to WHO, the polluted air is the main reason causing 36%
of deaths due to lung cancer, 27% of heart attacks, 34% of strokes, and 35% of deaths from
respiratory [1]. In such circumstances, it is indispensable to have a comprehensive solution
for monitoring air quality on a large scale for citizens and local governments. Accordingly,
there have been many air quality monitoring systems in literature, which can be roughly
classified into two main categories: stationary and mobile. The stationary system uses fixed
stations to monitor air quality, either outdoor [2] or indoor [3]. The air quality monitoring
system operates as a wireless sensor network (WSN) [46]. While the sensor nodes monitor
the surrounding environment, the base stations are in charge of storing and processing the
sensory data On the one hand, the sensor nodes monitor their surrounding environments. On
the other hand, the sensory data is either stored at the sensor’s local memory or transferred
to the base station. Despite the wide adoption, the stationary systems still suffer from an
inherent critical limitation: the low-resolution sensing data. That is because the fixed
monitoring station has the sensed data for only a limited area. Besides, the stations require
high deployment and maintenance costs. It is, therefore, challenging to deploy them densely.
For example, in Hanoi, Vietnam, the local government and other organizations have less
than 50 stations in the total area of 3329 km
2
[7].
Unlike the stationary system, the mobile one makes use of mobile sensors to broaden the
monitoring areas. The sensors can leverage unmanned aerial vehicles (UAVs) [8,9] or land-
based ones [10]. Despite having a significant advantage in capturing spatial information, the
UAV-based approach copes with many critical issues, including high deployment cost,
energy constraint, etc. Therefore, this work focuses on the mobile air monitoring approach
that exploits land-based vehicles. We consider the public buses, on which a battery-powered
sensor senses the environmental information along the bus route. In such a system, it is
essential to determine which buses to place the sensors on and schedule the measurement,
considering the limited number of sensors and their battery capacity constraint.
Chapter 1. Introduction
12
1.2. Opportunistic sensing optimization (OSO) problem
This thesis addresses the issues of the vehicle-based mobile air quality monitoring system.
We form a novel optimization problem named opportunistic sensing optimization (OSO),
which is as follows. Given the bus routes’ trajectories, each bus route includes two paths
sharing endpoints, the available monitoring sensors, and the locations of critical regions
that need to be monitored. Each sensor can measure at most positions on each bus path
due to the energy and computational resource constraints. The OSO problem then asks to
determine optimal buses to place the sensors and 2 positions on each sensor’s bus route
to perform the air quality measurement. The objective is to maximize the number of
observable critical regions. OSO can be seen as a hybrid problem that jointly optimizes the
trajectory (i.e., the bus route) and the schedule (the positions to perform the measurement)
of the mobile sensors.
1.3. Thesis contribution
• We provide a theoretical model and prove the NP-hardness of the OSO problem.
• We propose the polynomial-time constant approximations for the OSO problem. More
specifically, in general, our algorithm guarantees the performance ratio of


. In a
simplified case where the two paths of each bus route are identical, the guaranteed
performance ratio of our algorithm is
. We present theoretical proofs about these
performance ratios.
• We utilize two meta-heuristics algorithms: genetic algorithm and simulated annealing, to
further verify the efficiency of the approximation algorithm. Experimental results showed
that these meta-heuristics helped increase only 1-3% the number of observable critical
squares on average, and they were slower than the simple greedy approach.
• We present extensive experiments to evaluate the proposed algorithms’ performance.
1.4. Structure of thesis
The remainder of the thesis is organized as follows. Chapter 2 presents the related works.
We formulate the OSO problem and prove its NP-hardness in Chapter 3. Chapter 4 is a brief
explanation about the techniques used to solve OSO in this thesis. Chapter 5 describes our
proposed algorithms and theoretical analysis about their effectiveness. The algorithms’
performance in practice is discussed in Chapter 6. In the end, chapter 7 concludes the thesis.
13
There are many efforts in the previous works aiming to build air monitoring systems.
However, most of them use static monitoring sensors. The works in [10,11] introduced a
concept similar to our investigated system. However, they focus on realizing the sensor
device, systems rather than optimizing the deployment. The OSO problem in this work is
close to the sensor placement optimization and scheduling under the target coverage
constraint in static WSNs and mobile WSNs.
In [12], the relay node placement problem is mathematically formulated as an NP-hard
Steiner minimum tree problem with a minimum number of Steiner points and bounded edge
length. The authors then proposed two heuristic algorithms whose performance ratios are
2.5 and 3.0, respectively. In [13], F. Senel et al. proposed to divide the target coverage
problem into sub-problems, each of which contains only three sensors. In [14], Anxing Shan
et al. considered a network comprised of omnidirectional probabilistic sensors. The authors
studied how to activate the least number of sensors to detect all targets with a probability
higher than a threshold . In [15], the authors investigated the optimal deployment in the
wireless rechargeable sensor networks. Specifically, they studied how to deploy a minimum
number of sensors to cover all the targets under the sensors’ limited sensing angle and the
mobile charger’s energy constraint. The problem of deterministic deployment in 3D
underwater WSN is addressed in [16]. The authors exploited a nature-inspired evolutionary
algorithm named Cuckoo search to determine the optimal position for placing sensors. The
objective is to maximize the target coverage capability with a minimum number of sensors.
The authors in [17] addressed, at the same time, the target coverage, connectivity, and fault
tolerance problems in wireless sensor networks. They proposed a hybrid algorithm that
combines the greedy approach and spanning tree technique to determine a minimal number
of sensors. Unfortunately, all of the algorithms mentioned above consider only networks
with static sensors.
Concerning the target coverage problem in mobile WSNs, there are relatively rare related
works. In this chapter we highlight four remarkable researches related to our problem, which
are [18], [19], [20] and [21].
In [18], the authors considered the target coverage problem to minimize the moving distance
of all sensors. The problem was named k-Sink Minimum Movement Target Coverage (k-
Chapter 2. Related works
14
MMTC). m): They have k sink stations to send mobile sensors and to cover all targets on an
Euclidean space, k-MMTC is to schedule the sensor movement trajectories and minimize
the sum of moving distance. They proved that k-MMTC was NP-hard.
To solve that problem, they proposed a polynomial-time approximation scheme (PTAS),
named Energy Effective Movement Algorithm (EEMA). They divided EEMA into two
phases. In the first phase, they proposed a novel method to divide the surveillance region
into some sub-areas according to the locations of targets. The sensors in the same sub-area
can cover the same target set. In the second phase, they scheduled the mobile sensors and
move the sensors to cover all targets. They proved that ε > 0, EEMA can be an (1+ε)-
approximation algorithm for k-MMTC problem that runs in time O(
/
). For large scale
networks, they proposed a distributed version named D-EEMA. In particular, to keep the
connectivity of the sensors, they used some mobile sensors for communication. They called
these sensors as communication sensors which do not have sensing tasks. The
communication sensors just need to move around the targets and the stations to collect
sensing data. D-EEMA was divided into two phases. In the first phase, they divided the
surveillance region into some subareas and got the positions of the targets. In the second
phase, they grouped the targets and dealt with different groups respectively. They also
provided experiments to validate the effectiveness and efficiency of EEMA and D-EEMA.
In all, EEMA was the first PTAS for sensor movement scheduling for target coverage
problem.
Nguyen et al. in [19] focused on mobile WSNs where the sensors cannot cover all the targets.
In mobile wireless sensor networks, the movement of sensors consumes much more power
than that in sensing and communication. In that research, the targets are weighted by their
importance. The more important a target is given a higher weight. These requirements make
the problem interesting, and also difficult. The aim of that study is to study a more general
and practical problem in terms of target coverage and network connectivity, namely the
Maximum Weighted Target Coverage and Sensor Connectivity with Limited Mobile
Sensors (TAR-CC) problem. Originally, the TAR-CC problem is to schedule a limited
number of mobile sensors to appropriate locations to cover targets and form a connected
network such that the total weight of the covered targets is maximized. In addition, when the
transmission range is assumed to be large enough for any communication, a subproblem of
the TAR-CC problem, termed the Reduced TAR-CC (RTARCC) problem was also
introduced.
An approximation algorithm, termed the weighted maximum-coverage-based algorithm
(WMCBA), with an approximation ratio of 1−1/e is proposed for the RTARCC problem,
where e denotes the base of the natural logarithm, was proposed. In the WMCBA, all
15
possible sets of targets that can be covered by a mobile sensor located at any point in the
sensing field are considered. Then, a greedy method is used to select suitable sets of targets
to be covered by mobile sensors. Based on the WMCBA, the Steiner-tree-based algorithm
(STBA) is proposed for the TAR-CC problem. In the STBA, the Fermat points and a node-
weighted Steiner tree algorithm are used to find a tree such that the number of mobile sensors
deployed by the tree structure to form a connected network is minimized. Simulation results
demonstrate that even if the number of mobile sensors is high enough such that a connected
network can always be formed to cover all targets, the STBA requires a significantly lower
total movement distance than the best solution proposed for the MSD problem. In addition,
when the mobile sensors may be not enough to cover all targets, the STBA works better than
the greedy method proposed in the simulation section of that paper.
In [20], Rout et al. addressed the target coverage problem with the consideration of obstacle
avoidance. In that piece of work, they proposed a localized self-deployment scheme, named
as Obstacle Avoidance Target Involved Deployment Algorithm (OATIDA), for deployment
of randomly scattered mobile sensor nodes to cover predefined targets while maintaining
connectivity with the base station in the presence of obstacles. The proposed deployment
scheme is based on the following assumptions. They were: (i) All the sensor nodes have
locomotion capability and can move independently, (ii) The base station is fixed in any place
inside the region of interest and bears all the information about the targets. (iii) Initially, all
the sensor nodes are randomly deployed within the communication range of the base station
(iv) Each sensor node has one unique ID, (v) Every sensor node has the ability to know its
own coordinates by some localization method (e.g., GPS, triangulation and multilateration),
(vi) Every sensor node is able to acquire the relative position of the other sensor nodes within
its communication range, (vii) All the sensor nodes have circular sensing and communication
areas, (viii) The sensing field contains obstacles of arbitrary shapes, and (ix) Every sensor
node is able to detect the shape and position of any obstacles in its sensing range and can
calculate the nearest distance from the obstacle by using the time-of-flight method.
They used the concept of potential field theory and relative neighborhood graph for self-
deployment of mobile sensor nodes in an unknown environment to achieve target coverage
while preserving connectivity with the base station. Their proposed approach is localized in
the sense that each decision taken by the sensor node is strictly based on information
acquired from its neighboring sensors that are part of the relative neighborhood graph. That
algorithm works well in scenarios of single target, multiple targets and moving targets in the
presence of single or multiple obstacles. The proposed algorithm preserves connectivity
during the deployment procedure and minimizes the number of sensor nodes to maintain the
connectivity so that large numbers of sensor nodes are available for monitoring targets.
16
The problem of minimizing the mobile sensors’ moving distance under the target coverage
constraint is re-visited by Choudhuri et al. in [21]. The existing sensor relocation algorithms
for target coverage and connectivity are based on the assumption of the free mobility models.
Under these models, each sensor can move any amount in any direction. But for real life
situations, the movement of the sensors may be restricted. Here they intended to solve the
problem of target coverage with a rectilinear mobility model where a sensor can move only
along two mutually perpendicular directions. Since the (x, y) coordinates of a location can
be transformed to another set of coordinates by a simple rotation, they assumed in that paper
that a sensor can move along directions parallel to the x and y-axes only. Thus, if a sensor
moves from point u to v, the distance covered by it is the Manhattan Distance between u and
v.
Their algorithm worked in two phases. In the first phase, a subset of sensors are moved to
new locations such that all targets are covered. The sensors which are essential to ensure
coverage are called assigned sensors. Only if the first phase results in some assigned sensors
not connected to BS, the second phase is initiated to move some unassigned sensors to
achieve connectivity. The proposed algorithm initiates movement of the sensors only if it is
strictly necessary. Even though that work gives a way of relocating sensors when the initial
deployment does not ensure coverage, it can be also used at a later stage when some of the
sensors die out resulting in either loss of coverage or connectivity.
All existing works above on mobile WSNs focus only on either the trajectory or the sensors’
schedule. This thesis takes a more general approach, in which we address at the same time
both the optimal trajectories to place the mobile sensors and the optimal positions to perform
the measurement. We establish a complete process from formulating to solving our problem
and conducting experiments, which makes the results valuable in research and applicable in
real-life cases.
17
3.1. Problem statement
We introduce a new problem named opportunistic sensing optimization (OSO). A map of
bus routes is given on a grid of × squares on a 2-dimensional plane ( and are
predefined integers). Each square is marked either critical or non-critical. The critical
squares are regions that need to be monitored. The grid has a total of  critical squares (
× ). There are bus routes on the map, each of which consists of departure, arrival
endpoints, and two bus paths connecting two endpoints. A bus path is assumingly a fixed
polyline. A bus departs on one path and returns on another path. We assume that every bus
route has exactly one bus. There are ( ) air quality monitoring sensors that need to
be installed on buses, where each bus can have at most one sensor installed. We adopt the
disk-based model to represent the sensing capability of the sensors. Each sensor can observe
all points in the disk of radius centered at its position. Due to the energy and computation
resource constraint, a sensor can only be turned on at exactly positions on each path of a
bus route to measure the air quality (and thus, 2 positions in total on the two paths).
Otherwise, it has to be turned off immediately after a quick measurement. The measurement
time is negligible, and the turn-on positions of sensors are free to be chosen. However, they
are fixed before installation on the buses (see Fig. 1 for an example). We assume that the air
quality of every point inside the same square is almost identical. A square is then said
observable by a sensor if it intersects the circle of radius centered at a turn-on position of
the sensor.
Figure 1. A map of size 4 × 4 with 3 bus routes and 6 critical squares. When = 2, an
example of the sensor’s turn-on positions on bus 1 is shown. With such selected positions,
that sensor can observe 5 critical squares , , , and .
Chapter 3. Problem formulation
18
Table 1. Notation list.
Notation
Meaning
p
Number of columns in the bus map grid
q
Number of rows in the bus map grid
c
Number of critical squares
n
Number of bus routes
m
Number of sensors
r
Sensing radius of a sensor
k
Maximum number of time a sensor can be turned on a path
O(
)
Observable boundary of C
C(b)
The set of the leftmost and rightmost observable bounds of a bus
route b
The OSO problem asks to choose bus routes to install sensors and decide the sensor’s
turn-on positions on each of these routes to maximize the number of observable critical
squares. This work considers two scenarios: the general scenario and a simplified scenario
where the two paths on each bus route are identical.
3.2. Mathematical formulation of OSO
In this section, we introduce the definition and mathematically formulate the targeted
problem. To facilitate readability, we summarize all the notations in Table 1.
Definition 3.1 (Observable Boundary). Let be a critical square. The outer contour that has
a distance of to ’s boundary is called the observable boundary of , and denoted as O(
).
Fig. 2 illustrates two critical squares (
1
,
2
) and their observable boundaries (
1
) and (
2
).
Definition 3.2 (Observable Square). Let be a point on the plane. An observable square of
is a critical square that is monitored by a sensor located at ; the observable square set of
is the set of all observable squares of .
In Fig. 2,
1
is an observable square of X
1
and X
2
, while
2
is observable by only X
1
. The
observable square set of X
1
consists of
1
and
2
, while that of X
2
contains only
1
.
19
Figure 2. Illustration of observable boundary, observable square, and observable segment.
Proposition 1. A critical square
is an observable square of
if and only if
stays on the
boundary or inside of O(
).
Proof. is observable by if and only if the distance from to ’s boundary does not
exceed . This condition is equivalent to that stays on the boundary or inside O()
Definition 3.3 (Observable Segment). Suppose is a critical square and is a straight bus
route segment. The observable segment of concerning is defined by the portion of
staying on the boundary or inside O().
Moreover, we define the leftmost and rightmost observable bounds of concerning O() as
the first and last point of the observable segment when we move from one end of the other
end of the bus route.
Fig. 2 represents a bus route and its observable segments. Segment P
1
P
2
stays inside
O(
1
), thus its observable segment concerning
1
is itself. Segment P
5
P
6
has the left end
point, i.e., P
5
, stays inside O(
2
), thus the observable segment of P
5
P
6
concerning P
5
P
6
comprises of two end points, one is P
5
and the other is the intersection of P
5
P
6
and O(
2
),
i.e., I
6
. As the two end points of segment P
8
P
9
stays outside of O(
2
), P
8
P
9
’s observable
segment concerning
2
comprises of the two intersection points of P
8
P
9
and O(
2
), i.e., I
8
,
I
9
. In a special case, the observable segment of P
7
P
8
concerning
2
deduces to a point P
7
.
Proposition 2. Let
,
be a straight bus route segment and a point on
;
is a critical
square. Suppose I
1
I
2
is the observable segment of
concerning
.
is an observable square
of
if and only if
stays between I
1
and I
2
.
20
Proof. According to Proposition 1, is an observable square of if and only if stays on
the boundary or inside of O(
). On the other hand, stays on , thus, is an observable
square of if and only if belongs to the intersection of and O(
). i.e., I
1
I
2
Theorem 3.1. Suppose
is a bus route and C(
) is the set of
’s leftmost and rightmost
observable bounds concerning all the critical squares. Let
be an arbitrary point of
such
that
’s observable square set is not null. Suppose
is the leftmost bound of
’s observable
segment, closest to
. Then the observable square set of
is the subset of
’s.
Proof. Let us denote by the straight bus route segment containing , then obviously
must also contain . Suppose is an observable square of , we will prove that is also an
observable square of . We denote by P
u
P
v
the observable segment of with respect to ,
where P
u
and P
v
are the leftmost and rightmost observable bounds. As is an observable
square of , according to Proposition 2, must stay between P
u
and P
v
. Specifically, must
stay on the right side of P
u
and the left side of P
v
. As is the leftmost observable bound
closest to , must either overlap or stay on the right side of P
u
. On the other hand, as P
v
stays on the right side of , it also stays on the right side of . Consequently, stays between
P
u
and P
v
. Thus, must be an observable square of (see Fig. 3).
Figure 3. Illustration of Theorem 3.1’s proof ( is an arbitrary point on a bus route segment
. is the leftmost observable bound closest to . If is a critical square observable by ,
then it is also observable by ).
From Theorem 3.1, we deduce that to decide the optimal turn on locations of sensors on
buses, we just need to determine the optimal turn on points among the leftmost observable
bounds. We call these leftmost observable points critical points.
21
In the following, we present the mathematical formulation of the considered problem. We
define binary variables 


, 


,


,


,

,

,
as follows.



equals 1 if the critical square can be observed from the
th
critical point on the first
path of bus route , is 0 otherwise.



equals 1 if the critical square can be observed from the
th
critical point on the second
path of bus route , is 0 otherwise.


equals 1 if we choose to observe the critical square from the
th
critical point on the
first path of bus route , is 0 otherwise.


equals 1 if we choose to observe the critical square from the
th
critical point on the
second path of bus route , is 0 otherwise.

equals 1 if the
th
critical point on the first path of bus route is chosen to be a turn-on
position, is 0 otherwise.

equals 1 if the
th
critical point on the second path of bus route is chosen to be a turn-
on position, is 0 otherwise.
equals is 1 if a sensor is installed on bus route , is 0 otherwise.
Moreover, denoting
i1
,
i2
as the number of critical points on the first and second path of
bus route , our problem is mathematically formulated as follows.
Objective
Maximize






× (

×

×

+ 

×

×

)
Subject to
, 1 :





(

×

+ 

×

) 1 (1)
, 1 :



,



(2)

= (3)
22
The objective function that has to be maximized is simply the number of critical squares
covered. A square t is covered when four conditions below are satisfied:
- A bus route i is chosen to install sensor (i.e.
= 1).
- The square t can be observe from the j
th
critical point on the first path of i, i.e. 


= 1 (or
the second path of i, i.e. 


= 1).
- The j
th
critical point on the first path is chosen to observe t, i.e.


= 1 (or on the second
path is chosen to observe t, i.e.


= 1).
- The j
th
critical point on the first path is chosen to be a sensor’s turn-on position, i.e.

=
1 (or on the second path, i.e.

= 1).
That’s why the objective function is written as above. To make sure that an observed critical
square is not count twice or more, and to guarantee that we don’t use more than m sensors
or turn on them more than k times on some bus path, we need three additional constraints.
Constraint (1) ensures each observable critical square is counted at most once in the objective
function. Constraint (2) guarantees that there are at most turn-on positions on any path.
The last constraint depicts that there are bus routes picked to install a sensor.
3.3. Hardness of OSO
We prove the hardness of the OSO problem by the reduction theory using the known NP-
hard maximum coverage problem (MCP) [22], stated as below.
Given a number and a collection of sets = {
1
,
2
, … ,
β
}, the maximum coverage
problem require to find a subset of sets to maximize the number of covered elements
|


i
| such that | ′ | ≤ .
Theorem 3.2. OSO belongs to the NP-hard class.
Proof. We prove this by reducing from MCP to OSO. First, we set a bus map on a grid of
size || × 3 (columns × rows), where =

. Let the critical regions be at all squares
of the second row, hence each element in represents a square. We then need to build a bus
route including the depart and return parts for each set
i
,  = 1 … , such that one of its
paths does not touch any square on the map’s second row; and the other path goes through
only the squares represented in
i
. Such a route always exists by passing alternatively these
three rows at each column containing a square represented in
i
(see Fig. 4 for an example).
Finally, we set = , = , = 0, and = max
i
|
i
|.
23
Figure 4. A corresponding bus map when = 3,
1
= {, , , , },
2
= {, , , },
and
3
= {, }.
It is clear that if there is an optimal solution for OSO, the subset containing ’s elements,
which correspond to the routes in , is also an optimal solution for MCP. The optimal
solution for MCP can also be converted to an optimal set of bus routes and turn-on
positions for such instance of OSO in polynomial time. Therefore, MCP is proven to be not
harder than OSO. That means OSO is NP-hard.
In chapter 5, we introduce a greedy approximation algorithm for OSO. We prove that in a
general case, the algorithm gives an


. Moreover, in a simplified case where the
approximation ratio of two paths on each bus route are identical, it gives an approximation
ratio of
.
24
4.1. Approximation algorithms
In computer science and operations science, approximation algorithms produce near-optimal
results, which the furthest distance from the optimal result is always provable. Because of
the belief that P is different from NP, some problems are said to be unsolvable in polynomial
time, hence approximation algorithms are proposed as an alternative to finding the optimal
algorithm.
Any approximation algorithm must be accompanied by a theoretical proof that certifies the
efficiency of the result produced in the worst case. Algorithms of this class differ from meta-
heuristics, such as genetic or simulated annealing algorithms, in that meta-heuristic
algorithms do not guarantee the quality of results by theoretical proof, although in practice
they are often superior to approximation algorithms.
In this thesis, the greedy approximation algorithms provide results with provable efficiency,
thereby serving as the basis for other algorithms to compare.
4.2. Meta-heuristic algorithms
Meta-heuristic is the name for a group of algorithms that allow to find good enough results
for problems that do not have a polynomial time solution and the search space for a solution
is much larger than the computation capacity. These algorithms do not guarantee the optimal
solution, nor prove the efficiency of the solution as approximation algorithms. However,
there are still many proofs about the convergence ability and the probability of obtaining
optimal results of these algorithms. Many meta-heuristic algorithms are inspired by natural
processes, such as genetic algorithm, simulated annealing algorithm, swarm algorithm, and
more. They are often very good at experimentation, and can especially be used for many
different problems.
In this thesis, two meta-heuristic algorithms, genetic algorithm and simulated annealing
algorithm, are used as independent models to solve the OSO problem, which is an NP-hard
problem. They are appropriate methods to verify efficiency of the approximation algorithm,
since their tremendous performance in practice was shown in numerous research papers,
especially researches related to air monitoring systems. If the greedy approximation
Chapter 4. Theoretical Background
25
approach is decent, the experimental results produced by it should be competitive to the ones
produced by the chosen meta-heuristics. It is indeed true, and we will show the experimental
results supporting this observation later in this thesis.
4.2.1. Genetic algorithm
Genetic algorithm (GA) is a technique for solving combinatorial optimization problems.
Among the meta-heuristic algorithms, GA is a well-known algorithm, which is inspired from
the biological evolution process [28]. GA mimics the Darwinian theory of survival of the
fittest in nature. GA was proposed by J.H. Holland in 1992. The basic elements of GA are
chromosome representation, fitness selection, and biological-inspired operators. Holland
also introduced a novel element namely, Inversion that is generally used in implementations
of GA [29].
GA simulates the evolution of an organism. Its basic idea is to treat each solution (each
individual) as a chromosome, which contains many genes. Initially, a population of many
individuals is created. Over generations, depending on a given fitness function, individuals
will interbreed to create new individuals, or they will be mutated, or they will be eliminated
by natural selection, and so on. The process will stop when certain conditions are satisfied,
for example the quality of the population does not increase after a few consecutive
generations. During the whole process, the best instance that ever existed will be selected as
the result of the algorithm.
The skeleton of a simple genetic algorithm will have the following structure:
Algorithm 1: Genetic algorithm
Input: The definition of individual (solution) and gene, a fitness function
Output: The best solution found after the algorithm halted
Initialize a population P of POP_SIZE solutions
Initialize best solution s* = the best solution in P
while not meet STOP CONDITION do
Crossover (by probability p
cross
) pairs of solutions from P -> new solutions
Mutate (by probability p
mut
) solutions in P -> new solutions
Assign s* = the best solution found in P and newly created solutions
Build a new population P’ from P and newly created solutions based on f
Assign P = P’
return s*
There are lots of other variations for genetic algorithms, and the above procedure is one of
the most general instances. Note that:
26
- The initialized population should contain randomized solutions. Sometimes, the solutions
may be chosen from a specific area of the search space where optimal solutions are likely to
be found.
- The crossover operator takes two or more solutions (parents) and merges them in a way
that creates new feasible solutions (children). The children are supposed to inherit good
characteristics of their parents, hence likely being a better solution than their ancestors. In
each generation, many crossover operations can be done, depending on a given probability
p
cross
.
- The mutation operator takes one solution and modifies it to make a new solution, by a given
probability p
mut
. It enhances the diversity of the population, and avoids converging to a local
optimum.
- The new population P’ of the next generation can be greedily built from P and newly
created solutions. Occasionally, dummy or bad solutions are kept to ensure the diversity of
the population.
- There are many kinds of stopping conditions, for example: the best solution has not
changed for several generations, or the average fitness of the population has not increased
for a long time, etc.
4.2.2. Simulated annealing
Simulated annealing (SA for short) was first applied to optimization problems by S.
Kirkpatrick et al. [30] and V.Cerny [31]. In the book "Metaheuristics: From design to
implementation" of El-Ghazali Talbi [32], the author described almost every aspect of SA
in detail. It is a meta-heuristic to approximate an optimal solution in a large search space for
an optimization problem. The idea of the SA algorithm is derived from physical metallurgy.
The metal is heated to high temperatures and cooled slowly so that it crystallizes in a low
energy configuration.
SA is chosen as a method to solve this OSO problem because of its simplicity and efficiency.
It allows for a more extensive search for the global optimal solution, and can even find a
global optimal solution if it runs for enough time.
The skeleton of a simulated annealing algorithm will have the following structure:
Algorithm 2: Simulated annealing algorithm
Input: A way to compare two solutions, the function p(T, s, s’), the cooling function
Output: The best solution found after the algorithm halted
27
Initialize current solution s = any solution s
0
Initialize best solution s* = s
Initialize T = T
max
while T > 0 do
for i from 1 to L do
Generate a random neighbor s’ of s
if s’ is better than s then
Assign s = s’
if s’ is better than s* then
Assign s* = s’
else
Assign s = s’ with probability p(T, s, s’)
T = cooling(T)
return s*
Note that:
- In theory, the initial solution s
0
can be any valid solution and it does not affect the quality
of SA. However, when the solution searching space is too large, a good initial solution can
be a suitable approximation for the global optimum in a short amount of time.
- A neighbor s’ of s is a solution which is not much differ from s.
- The probability p(T, s, s’), also called move acceptance probability, must satisfy that better
solutions have more chance to be chosen; and the higher the temperature T, the more likely
that the move is going to be done.
- The cooling function cooling(T) returns a number slightly smaller than T, which
demonstrates a cooling process in metallurgy. In theory, the higher T
max
and L are the higher
chance for the optimal solution to be discoverable. However, to save computation energy,
these three hyperparameters should be carefully tuned.
4.3. Research methodology
The research method to solve the OSO problem includes the following steps:
- Step 1: Prove that the OSO problem is an NP-hard problem (done in the previous chapter).
Once proven, the optimal algorithm will no longer be searched, but instead we will look for
good approximation algorithms and meta-heuristic algorithms.
- Step 2: Build an approximation algorithm. The approximation algorithms are prioritized to
be searched first because the goodness of the solution they provide compared to the optimal
solution can be guaranteed. From there, we can use it as a comparison base for other
algorithms.
28
- Step 3: Construct meta-heuristic algorithms. In experiment, meta-heuristic algorithms are
really effective problem solving tools. In problems with large search spaces such as OSO
problems, meta-heuristics are often one of the first options to solve the problem that
experienced researchers will mention.
- Step 4: Create tests and run tests of the built algorithms. This is an indispensable step to
confirm whether the algorithms are effective in practice. The dataset is built directly from
the bus map of the city (in particular, Hanoi), thereby ensuring the objectivity of the test.
- Step 5: Based on the results, draw conclusions about the feasibility of the proposed
algorithms.
29
5.1. Approximation algorithms
5.1.1. Greedy algorithm and proof of effectiveness
5.1.1.1. Greedy approximation algorithm
We develop a greedy algorithm named grOSO to solve the problem. The algorithm will find
approximation ratios for both the general and the simplified scenarios. The main idea is as
follows. For each bus route, we desire to determine the maximum observable critical square
set, which is the biggest set of critical squares observable by a sensor mounted on that bus,
when the sensor is turned on at most times on each path of the bus route. We will use a
heuristic to find an observable critical square set whose size is as large as possible. The set
is called sub-maximum observable critical square set. The following process is repeated
times:
1. For each bus route, we heuristically determine a sub-maximum observable critical square
set.
2. The bus route with the largest sub-maximum observable critical square set is selected to
install a sensor.
3. After that, we remove the chosen bus route from the map and change all items of its sub-
maximum observable critical square set into non-critical squares.
Figure 5. The remaining map after removing bus 1 from the map in Fig. 1, and the greedy
process continues.
Chapter 5. Proposed solution
30
Finally, we have selected bus routes with 2 sensor’s turn-on positions on each route,
which are our greedy algorithm’s results. Fig. 5 shows the remaining map after removing
bus 1 from the map in Fig. 1. The critical squares , , , and are changed to non-
critical squares.
The grOSO pseudo-code is presented in Algorithm 3. The input includes the bus map, the
sensor radius (i.e., ), the number of sensors (i.e., ), and the number of turning on points
per sensor (i.e., ). The map  consists of the bus routes on a × grid and positions
of critical squares. The output is a set of observable critical squares. For the sake of brevity,
we do not output the selected bus routes and the sensors’ turn-on positions. The function
submaxSet(busMap, r, k, j) returns a sub-maximum observable critical square set of the bus
route given the information of the bus map , the sensor radius and the sensor’s
turn-on limit . This submaxSet function is calculated in polynomial time by a greedy
process discussed in Section 5.1.2.
Algorithm 3: Greedy approximation algorithm
Input: Bus map busmap, sensor radius r, number of sensors m, no. turn-on time k
Output: The constructed solution S
procedure grOSO(, , , )
for ← 1 to do
, ← 0 : bus route to install sensor
for ← 1 to do
if Bus route has not been chosen yet then
(, , , )
else
if | | ≥ || then
,

Bus route is marked as chosen
Squares in
is changed to be non-critical
return
5.1.1.2. The lower bound of performance ratio
The efficiency of our greedy algorithm mainly depends on the function submaxSet. The
obtained approximation ratio follows the approximation method of C. Chekuri and A. Kumar
[23] for MCP with group budget constraints. We prove that for any 1, if the algorithm
submaxSet always returns a sub-maximum observable critical square set which can cover at
least
as many critical squares as a maximum observable critical square set of a bus route,
31
then our greedy approach is a

-approximation algorithm. By using this claim, in Section
5.1.2, when a greedy method is used for the function submaxSet with = 1
, our
algorithm for the OSO problem is proved to be

=


-approximation. After that, by
applying a dynamic programming method to a simplified scenario of OSO, we extend this
result to show another approximation algorithm of ratio
. At the end of this section, we
present a simpler version of C. Chekuri and A. Kumar’s proof that includes modifications
fitting our problem formulation.
We assume that the greedy algorithm chooses the bus route in the
th
iteration of the outer
for-loop by renumbering the bus routes. Let  be a fixed optimal solution, and
1
,
2
, …,
m
be the indices of the bus routes that  installs sensors. We additionally assume that if
both the solutions of the greedy algorithm and  choose the bus route , then
i
= , by
reordering the chosen bus routes in  . Let
i
be the set of new observable critical squares
determined by the greedy algorithm when the bus route is chosen. Let
i
be the covered
set of new observable critical squares determined by  when the bus route
i
is chosen.
Moreover, =

and =

are denoted as the numbers of observable critical
squares that are covered by the greedy algorithm and  . We need to prove that for any
1, if the algorithm submaxSet always returns a sub-maximum observable critical square
set which can cover at least
as many critical squares as a maximum observable critical
square set of a bus route, then ||

||. To fulfill this task, the following critical
observation (Lemma 5.1) is indispensable.
Lemma 5.1. For any
1, if the algorithm submaxSet always returns a sub-maximum
observable critical square set that can cover at least
as many critical squares as a
maximum observable critical square set of a bus route, then , 1≤
: |
i
| ≥
|
i
|.
Proof. If
i
= O
i
, O
i
. So |
i
| ≥
|O
i
| = 0. We assume that
i
≠ O
i
, which means O
i
is not chosen to be covered in the
th
operation of the outer for-loop in the greedy algorithm.
Since submaxSet always returns a sub-maximum observable critical square set with the
covering condition,
i
will be at least
as large as the maximum size of all maximum
observable critical square sets of all bus routes. Consequently, we have |
i
|
|O
i
-


|. In addition,


, therefore |
i
| ≥
|
i
|.
32
Lemma 5.1 shows that the number of newly covered squares
i
is always not less than
of
the number of squares that are covered by the bus route j
i
in  but not covered by the
greedy algorithm. With this lemma, our task is achieved by the following theorem.
Theorem 5.1. For any
1, if the algorithm submaxSet always returns a sub-maximum
observable critical square set that can cover at least
as many critical squares as a
maximum observable critical square set of a bus route, then the greedy algorithm grOSO is
a

-approximation algorithm for the OSO problem.
Proof. We have to prove that || ≥

||. Since
|| =

|
|

|
|
(|

| ||) =
||
|| ,
therefore

||
|| ||

||.
5.1.2. Sub-maximum observable set calculation
In grOSO, the function submaxSet(busMap, r, k, j) returns a sub-maximum set of observable
critical squares on the bus route when the bus map , the sensor radius , and the
sensor’s turn-on limit are given. To guarantee the polynomial running time of grOSO,
submaxSet ’s time complexity has to be polynomial. In the following, we introduce our
approaches for the simplified and general scenarios.
5.1.2.1. Simplified scenario
5.1.2.1.1. Dynamic programming-based approximation algorithm
The simplified scenario indicates the two paths of every bus route are identical. Specifically,
each bus route’s two paths are the same polyline; and the total number of allowed turn-on
positions to be determined on that polyline is 2.
The NP-hard proof in this scenario is straightforward since the same method in Section 3.3
can be used. We first make an assumption which is reasonable since it usually appears in
practical as follow. For any pair of (, ), where is a critical square and is a bus route,
if a sensor can observe from some positions on , all sensor’s turn-on positions on where
is observed are lying on exactly one closed segment [l
Ab
,
Ab
], see Fig. 6a for an
illustration. This interval can be found in polynomial time by using computational geometry
techniques. We then present a dynamic programming approach for the function submaxSet
to construct the maximum observable critical square set for any bus route. In this simplified
scenario, our dynamic programming approach guarantees that the set found by the
33
submaxSet function is always maximum. thus the number mentioned in the previous
section 5.1.1.2 will be equal to 1. Later we will show that we cannot use dynamic
programming in the general scenario, and we will need another greedy sub-process which
has a lower performance ratio for that.
Suppose that there are critical squares which can be observed from at least one sensor’s
turn-on position on a bus route, as the above assumption, there are critical points which
are the left endpoints (l
Ab
) of the observable intervals ([l
Ab
,
Ab
]) (an illustration in Fig. 6b).
It is obvious that a sensor should only be turned on at critical points for the highest efficiency.
Figure 6. (a) [l
Ab
,
Ab
] is the unique close segment that contains all sensor’s turn-on positions
on the bus route where the critical square is observed. (b) There are critical squares
observed by turning on sensor from bus route (in this figure, = 5). Each square can be
observed by a sensor turned on at somewhere in the middle of the interval [l
ib
,
ib
]. We then
have critical points which are the left endpoints (l
ib
, where = 1, … , ) of such intervals.
The below steps describe our method.
1. The points are sorted in an increasing order of their distances to the bus route’s starting
point.
2. We compute (, ) for all pair (, ), 0 < , which is the number of critical squares
observable from the
th
critical point but not observable from the
th
critical point (the 0
th
critical point is set as −∞).
34
3. Let (, ) be the maximum number of observable critical squares if we are allowed to
turn on the sensor times at critical points from the (+1)
th
point to the
th
point (a sensor
can be turned on more than one time at the same point). The function submaxSet should
return (0, 2). The dynamic programming formula for is following.
• Base case: (, 0) = 0, where 0 ≤ ;
• General case: (, ) = max
u
((, − 1) + (, )), where 0 ≤ < and > 0.
The pseudo code of submaxSet is represented in Algorithm 4.
Algorithm 4: Dynamic programming for submaxSet
Input: Bus map busmap, sensor radius r, no. turn-on time k, the current bus b
Output: The maximum number of critical squares a sensor on b can observe, and a
maximum set of critical squares that sensor can observe.
procedure submaxSet(, , , )
critical square , calculate the interval [l
Ab
,
Ab
]
Sort critical points in increasing order
Compute (, ) for all (, ), 0 ≤ <
for ← 0 to 2 do
for ← 0 to do
(, ) ← 0
if > 0 then
for + 1 to do
 (, − 1) + (, )
if (, ) <  then
(, ) = 
← an arbitrary set of (0, 2) critical squares which can be observed when a
sensor is installed on bus route
return (0, 2) and
5.1.2.1.2. Efficiency analysis
Since Algorithm 4 produces the maximum observable critical square set for any bus route ,
from Theorem 5.1, the algorithm grOSO is

=

=
-approximation for this simplified
version of OSO.
5.1.2.2. General scenario
5.1.2.2.1. Greedy approximation algorithm
The general scenario indicates there is no constraint on the bus paths. Following Proposition
1 in Section 3.2, a sensor should only be turned on at critical points for the highest efficiency.
35
The following steps show how to find a sub-maximum set of observable critical squares on
the bus route .
1. Determining all critical points on ;
2. For each critical point , we build a set of all critical squares that a sensor can observe
when it is turned on at .
3. Considering ’s two paths at the same time, we choose each critical point in turn to be a
sensor’s turn-on position until critical points are chosen on each path. Note that is chosen
as a critical one if its corresponding set contains the maximum number of critical squares
uncovered by previously chosen critical points.
The pseudo code of submaxSet is in Algorithm 5.
Algorithm 5: A greedy method for submaxSet in general case
Input: Bus map busmap, sensor radius r, no. turn-on time k, the current bus b
Output: The number of critical squares a sensor on b can observe by utilizing the greedy
method, and a maximum set of critical squares that sensor can observe.
procedure submaxSet(, , , )

1
← set of all crit. points on 1
st
path of

2
← set of all crit. points on 2
nd
path of
, 
1
← 0, 
2
← 0
for ← 1 to 2 do
1
′ ← ,
2
′ ←
for ← 1 to 2 do
for 
j
do
← set of all crit. sqr. observable from
if | | ≥ |
j
′ − | then
j
′ ←
if 
2
= or (
1
< and |
1
′ | ≥ |
2
′|) then
1
′ , 
1

1
+ 1
else
2
′ ,

2

2
+ 1
return || and
5.1.2.2.2. Efficiency analysis
We will show that submaxSet in the previous section always produces a result of size at least
(1 −
) the size of a maximum observable critical square set for any bus route, leading to an


approximation ratio for the greedy algorithm grOSO.
36
Let
b
be the element number in a maximum observable critical square set of the bus route
. Let
i
be the number of new observable critical squares added to the sub-maximum
observable critical square set by submaxSet at the th iteration (0 ≤ 2, assume that
0
=
0). We denote  as the number of covered critical squares after the first iterations of
submaxSet (hence
=

), and
i
as
b
i
. Our objective is to prove that
2k
(1 −
)
b
. To fulfill this task, we first claim two below lemmas.
Lemma 5.2. For all 0 ≤
< 2
:
i+1

Proof. At the ( + 1)
th
iteration of submaxSet (0 < 2), we consider all critical points on
that have not been selected. There must be some sets of at most 2 points among these
critical points such that if a sensor is turned on at all points in this set, then at least
b
i
=
i
new critical squares will be covered. Otherwise, there will not be any way to cover
b
critical squares with 2 turn-on positions, which is a contradiction. Consequently, there must
be some non-chosen critical point that can observe at least

new critical squares. The
critical point selected at the ( + 1)
th
iteration is the one which observes the most number of
uncovered critical squares. Hence, it can also observe at least

new critical squares, thus
i+1

.
Lemma 5.2 helps us to prove another Lemma 5.3 which directly leads us to our goal.
Lemma 5.3. For all 0 ≤
< 2
:

(1

)

.
Proof. We prove using the induction hypothesis.
* Base case:
1
≤ (1 −

)
b
b
1
b



(Lemma 5.2).
* Inductive step: We have to prove that for any value (0 ≤ < 2 1), if the lemma is true
when = , then the lemma would also be true when = + 1. Indeed,

=





[Lemma 5.2] = (1 −

)

(1

)

[ ]
By the induction hypothesis, the lemma is proved.
Theorem 5.2, which is our goal, can be now proved by Lemma 5.3 and Theorem 5.1 shown
in Section 5.1.1.2.
37
Theorem 5.2. By taking into account Algorithm 5 to solve submaxSet, the polynomial
algorithm grOSO is an


-approximation for OSO.
Proof. From Lemma 5.3,

(1

)



(1
)
.
Therefore, submaxSet always produces a set of size at least

=

the size of a maximum
observable critical square set for any bus route. With the use of Algorithm 5 and Theorem
5.1, we can conclude that grOSO is a polynomial


-approximation algorithm for OSO.
5.1.2.2.3. Special case
 is a special case because when , the result produced from submaxSet ’s result will
be the maximum observable critical square set, which also contains all critical squares
observable from at least one point on the bus route. Hence, submaxSet will always return an
unique result when . By reducing from MCP, we can see that this case is also in NP-
hard. This case represents the scenario where energy and computation resource constraints
are released. More specifically, the sensors are allowed to be turned on for the whole bus
route instead of just turned on at particular positions.
Following Theorem 5.1, we can derive when , our greedy algorithm introduced in
grOSO will be a
-approximation algorithm since submaxSet always gives a maximum
observable critical square set. Actually, the approximation ratio is even better than 21 in
theory. If we consider the bus routes as critical points on a single bus route and the results
produced from submaxSet as sets of critical squares that these critical points can observe.
Then by using the same technique to prove Theorem 5.2, we can show that our greedy
algorithm is a (1
)-approximation algorithm. We summarize this important result as the
following theorem.
Theorem 5.3. If

, with the utilization of Algorithm 5, grOSO is a polynomial (1
)-
approximation algorithm for OSO.
5.1.3. Time complexity analysis
Theorem 5.4. The time complexity of grOSO (in the simplified and general cases) for the
problem OSO is
(
×
× (the calculation time of

))=
(
×
×
×

2
).
38
Proof. Let a constant integer number be the maximum number of segments that a bus route
may contains. In the simplified scenario, calculating the interval [l
Ab
,
Ab
] for all critical
squares takes () = (); sorting critical points takes ( log ); computing function
takes (
2
); computing function takes (
2
); and making the set takes (). Therefore,
it takes ( +  log  + 
2
+ 
2
+ ) = (
2
) time to calculate the function .
That leads to the time complexity of ( × × × 
2
) for .
In the general scenario, calculating the set of all critical points on both paths takes () =
(). Moreover, it takes ( ×  × ) = (
2
) to make the set . Therefore, the function
 needs ( +
2
) = (
2
) time for its calculation, which leads to the time
complexity of ( × × × 
2
) for function .
If , from any selected bus route, we can observe all critical squares that are at most a
distance away from it. Hence, is assumed to be at most . Since , we have ( ×
× × 
2
) = (
2
× 
3
). In a real condition, and are usually small, and << .
Therefore, the algorithm can produce results within an acceptable period, as shown in the
next section.
5.2. Meta-heuristic algorithms
Two meta-heuristics, the genetic algorithm and the simulated annealing algorithm, are
chosen to solve the OSO problem because of their simplicity and efficiency in practice.
Related researches about air monitoring systems also deployed these methods to solve
challenging problems, and the results usually show that they are good choices for creating a
solution. We have tried plentiful variations of genetic algorithm and simulated annealing
algorithm to solve OSO. In this thesis, we only show the best performed variation for each
of these meta-heuristics. There can be better versions for them, and nothing guarantees that
our algorithms are optimal among all. However, by experiment in the next chapter, it can
still be shown that our variations are efficient and highly practical in real life.
5.2.1. Genetic algorithm
If we consider each solution as an individual which consists of chosen bus routes and chosen
sensor’s turn-on positions as genes, then we can easily apply the genetic algorithm to solve
OSO. There is no constraint which restricts inter-buses sensors’ turn-on positions, and there
is also no constraint restricting intra-buses sensor’s turn-on positions, therefore solutions for
OSO are very free to be created, which means meta-heuristics like the genetic algorithm are
very suitable for OSO. There are many versions of genetic algorithms that we have tried. All
of them use the same outline shown in Algorithm 1 in Section 4.2.1. The differences between
39
them are illustrated in the algorithm’s operations (i.e. crossover, mutation and selection),
such as:
- Mutation: there are two kinds of mutation - swapping a bus route and swapping a critical
point. In one variation, we randomly swap a bus route for another unpicked bus route (with
critical points chosen randomly) to make a new solution, or we randomly swap a critical
point on a bus route for another unpicked critical point on that route to make a new solution.
On the other hand, in another variation, the newly swapped bus route’s critical points are
greedily chosen by a tactic similar to our approximation algorithm, and we don’t swap
critical points. Experiments showed that the first variation was better.
- Selection: there are two kinds of selection tactics - greedy and probabilistic selection. In
the greedy selection tactic, we simply keep the best solutions in the current generation to be
the individuals in the next generation. In the probabilistic selection tactic, we choose to keep
individuals by a probability proportional to their efficiency. Despite the simplicity, the
greedy selection tactic still produced better results than the probabilistic variation.
In this thesis, we only show the best variation, which is the combination of the best mutation
and selection operations described above. Here is the pseudo-code of our best genetic
algorithm for OSO.
Algorithm 6. Genetic algorithm for OSO
Input: Bus map busmap, sensor radius r, number of sensors m, no. turn-on time k
Output: The best solution found
procedure gaOSO(busMap, r, m, k):
Initialize a randomized population P of size POP_SIZE
Initialize best solution s* = the best solution in P
while not meet STOP_CONDITION_1 and STOP_CONDITION_2 do
Create a new population P’ = P
for each solution s in P do
Crossover s with another solution s’ in P -> new solution q
1
Swap a bus route in s with another bus route -> new solution q
2
Change a sensor’s turn-on position on a route -> new solution q
3
P’ = P’ {q
1
, q
2
, q
3
}
Sort all solutions in P’
Keep the best POP_SIZE elements in P’ and remove the remaining ones
Assign P = P’
Assign s* = best solution in P
return s*
Note that:
- The fitness function is simply the number of observed critical squares by a solution.
40
- In the initialized population P, every bus route should be included in at least one solution.
This can be done by fixing a bus route and randomizing the other (m - 1) bus routes when
making a solution.
- STOP_CONDITION_1 will be triggered when the best solution s* does not change after
several generations. STOP_CONDITION_2 will be triggered when the average number of
critical squares observed in each solution in P does not increase after several generations.
- The solution s’ to be crossover with the solution s is picked with probability proportional
to its number of observable critical squares. The crossover operation between s and s’ is
done by randomly picking the bus routes in these two solutions, until m routes are chosen.
The sensor’s turn-on positions are unchanged.
- When swapping a bus route to create a new solution q
2
, an existing route in s is removed,
and another unpicked route is added with randomized sensor’s turn-on positions.
5.2.2. Simulated annealing
Similar to the genetic algorithm, there are many versions of simulated annealing that we
have tried. All of them use the same outline shown in Algorithm 2 in Section 4.2.2. The
differences between them lie in the neighbor generation process. In particular, three different
neighbor generation processes were tested:
- A neighbor is created by either swapping a bus route or swapping a critical point on a route.
When swapping a bus route, the newly added route’s critical points are randomly chosen.
- A neighbor is created by either swapping a bus route or swapping a critical point on a route.
When swapping a bus route, the newly added route’s critical points are greedily chosen by
a tactic similar to our approximation algorithms.
- A neighbor is created only by swapping a bus route. The newly added route’s critical points
are greedily chosen by a tactic similar to our approximation algorithms.
In experiment, using the same set of hyper parameters, the last variation among these three
is the best neighbor generation process. In this thesis, we only show that best variation. Here
is the pseudo-code of our best simulated annealing algorithm for OSO.
Algorithm 7. Simulated annealing algorithm for OSO
Input: Bus map busmap, sensor radius r, number of sensors m, no. turn-on time k
Output: The best solution found
41
procedure saOSO(busMap, r, m, k):
Initialize current solution s = any solution s
0
Initialize best solution s* = s
Initialize T = T
max
while T > 0 do
for i from 1 to L do
Generate a random neighbor s’ of s by swapping a bus route
if s’ is better than s then
Assign s = s’
if s’ is better than s* then
Assign s* = s’
else
Assign s = s’ with probability p(T, s, s’) =
(󰆓) ()
T = T - T
dec
return s*
Note that:
- When swapping a bus route of s to get s’, an existing route in s is removed, and another
unpicked route is added of which sensor’s turn-on positions are greedily chosen so that the
number of newly observable critical squares is maximized. This idea is pretty similar to the
greedy method of choosing turn-on positions in our proposed approximation algorithms.
- The probability p(T, s, s’) is equal to
(󰆓) ()
, where obs(s) is the number of
observable critical squares of solution s.
42
6.1. Experimental settings
6.1.1. Goals and test data generation
We aim to integrate realistic conditions in the performance evaluation of grOSO. This article
assumes that the air quality indicators remain constant throughout a bus’s trip from the first
to the last bus terminals. This assumption is reasonable since bus routes in major cities (e.g.
Hanoi) often take only several hours. Therefore, a critical region is considered to be
monitored by a sensor if it falls inside the sensor’s sensing area at a turn-on timing. As a
result, our algorithm is only dependent on the bus routes’ trajectory, not on the particular
flow of each bus. Consequently, we need to utilize only the bus map in this experiment, not
the traffic simulator. Specifically, we construct the evaluation settings following the
situation in Hanoi, the capital of Vietnam. The bus map’s size is approximately 20 km × 30
km. Within the map, there are 60 bus routes, which are based on the actual ones. We take
advantage of the experiments to examine the impacts of parameters on the algorithm’s
performance. As proved in Section 5.1.2, grOSO is always at least as good as
= 50% and


38.7% of the optimal solution in the simplified scenario and the general scenario,
respectively. However, we expect the experimental ratio to be better.
The bus map is divided into a grid of × squares (i.e., grid cells) in each test case. Let a
positive integer , a real positive number be the number of critical squares and the sensor
radius, respectively. Note that the value unit of is the grid cell ( = means that the actual
size in practical of equals times the edge size of a grid cell on that bus map). We randomly
mark  cells on the grid as critical squares. Therefore, each test case is characterized by four
main parameters , , , and .
6.1.2. Evaluation methodology
We consider two performance metrics to evaluate our algorithm: running time and
efficiency. The running time is the time measured in second for running the algorithm. We
determine the program’s running time for every test case. The algorithm is composed in C++
and executed on a computer with the configuration Intel(R) Core(TM) i7-10850H CPU, 16
GB RAM, and 512 GB SSD. The efficiency of the answer is defined by the ratio of the
Chapter 6. Experimental results
43
number of monitored critical regions achieved by the algorithm to that of the optimal
solution.
We run the algorithm on the bus map, considering all possible values of the sensor number
and the sensor’s turn-on limit . For each pair of (, ), we try to determine the ratio of
the number of covered critical squares in that solution to the optimal answer. However, since
OSO is NP-hard, it is impossible to find optimal solutions in a reasonable time for all
experimental test cases. Hence, we measure the solution’s efficiency by an upper bound of
the maximum number of coverable critical squares instead. Suppose that is our answer,
which is a set of observable critical squares, is the set of observable critical squares in an
optimal solution, and Ô is an upper bound of ||. Because || ≤ Ô we have:
Efficiency value =
||
||
||
Ô
. Therefore, instead of
||
||
, we determine
||
Ô
as the algorithm’s
efficiency. The value of Ô can be obtained as follows.
On the original map, for each bus route , we find an upper bound
b
of observable critical
squares from a sensor on the route. After that, we sum up largest numbers from these
obtained ones. We denote this sum as
and as the number of critical squares that cannot
be observed from any bus route. We then set Ô = min(
,  ). It is straightforward to
prove Ô to be an upper bound of ||. The number
b
can be derived as the smaller number
between two return values obtained by the following two algorithms.
Algorithm 8. Calculating
b
Input: Original bus map originalBusMap, sensor radius r, number of turn-on time k, the
current bus b
Output: An upper bound for the number of critical squares a sensor on b can observe
procedure calculate_
b
_1(

,
,
,
):
return

(

,
,
,
) ×

Algorithm 9. Calculating
b
Input: Original bus map originalBusMap, sensor radius r, number of turn-on time k, the
current bus b
Output: An upper bound for the number of critical squares a sensor on b can observe
procedure calculate_
b
_2(, , , ):
Consider the original bus map, for every critical point on both paths of , make a
set of all critical squares observable from that point.
Sort these sets in decreasing order of their size.
Initialize a variable  ← 0 and iterate through these sets. For each set , if is
44
not a subset of some ‘‘marked’’ set, then assign   + || and mark . Continue
this process until 2
sets are marked or all the sets are iterated through.
return .
We will show that both algorithms produce a feasible value for
b
. Indeed, Algorithm 8 can
be proved by directly utilizing the proof of Theorem 5.2.
On the other hand, to demonstrate Algorithm 9’s correctness, we assume there are critical
points on a bus route . Moreover, the sets of all critical squares observable from each of
these points are sorted in decreasing order of their sizes. Denoting these sets as
1
to
, so
that |
1
| |
2
| |
|. Without loss of generality, we assume that the sets are pairwise
different and ≥ 2. If we can prove that the size of any union of 2 arbitrary sets selected
from {
1
,
2
, … ,
} is always smaller than the return value of Algorithm 9, then Algorithm
9 will be proved to produce a feasible value for
b
.
Let (
,
, … ,

) where
1
<
2
< <
2k
be the sets that Algorithm 9 marks; and let
(
,
, … ,

) where
1
<
2
< <
2k
be any group of 2 sets selected from {
1
,
2
,
… ,
}. For all where 1 ≤ 2, denote
as the index such that
is the largest set (in
size) in {
1
,
2
, … ,
} which is a superset of
(if there are many available values for
, choose the smallest value to assign to
). We can see that there is no (1 ≤ ≤ 2) such
that
{
1
,
2
, ,
2k
} and
<
2k
. Because if there is one index
satisfies these
conditions, due to Algorithm 9 there must be a set in {
,
, … ,

} which is a proper
superset of
, therefore
will not be the largest set to be a superset of
, which is a
contradiction. By this observation, it is clear that for all (1 2), either
is in {
,
, … ,

} or
>
2k
. Consequently, we have:
|

| ≤ |

| ≤ |
| + |
| + … + |

|,
and this proves the correctness of Algorithm 9.
We mention two different algorithms instead of one because each of them has its advantages
and disadvantages. Algorithm 9 performs better than Algorithm 8 when there are not many
critical squares on the map, and vice versa. Because when the density of critical squares is
low, two distinct sets of observable critical squares produced from two critical points tend
not to have subset/superset relation, so the result from Algorithm 9 is usually tighter. On the
other hand, when there is a high density of critical squares, the opposite happens, which
worsens Algorithm 9. Another reason is that as grows larger, the subset/superset relation
45
appears more frequently. Therefore, Algorithm 9 is often a better choice when is small and
vice versa.
For the special case  and the simplified scenario mentioned in Section 5.1.2.1, instead
of using Algorithm 8 and 9, the number
b
can be acquired directly from Algorithm 10 below
since it always returns the max. number of observable critical squares from a bus route .
Algorithm 10. Calculating
b
Input: Original bus map originalBusMap, sensor radius r, number of turn-on time k, the
current bus b
Output: An upper bound for the number of critical squares a sensor on b can observe
procedure calculate_
b
_3(

,
,
,
):
return (, , , )
6.2. Numerical results of approximation algorithms
6.2.1. Impacts of parameters on efficiency
Since the results in all the test cases have a similar trend, we present the heatmap of the
efficiency in the simplified scenario with = 25, = 30, = 2 and  = 100 and the general
case with = 30, = 36, = 0.5 and  = 100 in Fig. 7(a), Fig. 7(b), respectively. The
heatmaps show in most cases of and , our solution is optimum. We can observe that in
each test case with fixed , the efficiency is similar for almost all values of . The reason is
that for every bus route, a sensor only needs to be turned on a few times to monitor all critical
squares observable from that route. On the other hand, when is fixed and increases, the
efficiency drops from 100%, until reaches a specific value, often when is around 5.
Then, it increases rapidly to turn back to and stay at 100%. Indeed, Ô grows faster than ||,
however it immediately stops growing at the value  . Especially, when 15 and
10, we almost always cover Ô critical squares, which shows that the solution is also optimal.
Figure 7. Efficiency heatmap.
46
6.2.2. Minimum efficiency
This section presents the minimum efficiency values when applying the algorithm with all
combination of and in the simplified and general scenarios. Figs. 8, 9, 10, 11 shows the
results for the simplified scenario with the grid size ( × ) is (10 × 12), (25 × 30), and (42
× 50), respectively. In each figure, there are two subfigures. The (a) subfigure shows the
minimum efficiency for each case of (r, c), given p and q. The (b) subfigure shows the
average running time for these cases, where each line corresponds to a sensor’s radius. There
is a total of 58 test cases for such scenario.
On the other hand, in the general scenarios, there are 114 test cases, which are associated
with four pairs of (, ) (i.e., (10, 12), (25, 30), (30, 36), (42, 50)). The results are shown in
Figs. 12, 13, 14, 15. In each figure, there are three subfigures. The (a) subfigure shows the
minimum efficiency for each case of (r, c), given p and q. The (b) subfigure shows the
minimum efficiency for each case of (r, c) in the corresponding special scenario, given p and
q. The (c) subfigure shows the average running time for these cases, where each line
corresponds to a sensor’s radius.
(a) Minimum efficiency (%)
(b) Average running time (s)
Figure 8. Performance in the simplified scenario with = 10, = 12.
47
(a) Minimum efficiency (%)
(b) Average running time (s)
Figure 9. Performance in the simplified scenario with = 25, = 30.
(a) Minimum efficiency (%)
(b) Average running time (s)
Figure 10. Performance in the simplified scenario with = 30, = 36.
(a) Minimum efficiency (%)
(b) Average running time (s)
Figure 11. Performance in the simplified scenario with = 42, = 50.
48
(a) Minimum efficiency (%) for the
general scenario
(b) Minimum efficiency (%) for the
special scenario
(c) Average running time (s)
Figure 12. Performance in the general and special scenario with = 10, = 12.
49
(a) Minimum efficiency (%) for the
general scenario
(a) Minimum efficiency (%) for the
general scenario
(b) Minimum efficiency (%) for the
special scenario
(b) Minimum efficiency (%) for the
special scenario
(c) Average running time (s)
(c) Average running time (s)
Figure 13. Performance in the general and
special scenario with = 25, = 30.
Figure 14. Performance in the general and
special scenario with = 30, = 36.
50
(a) Minimum efficiency (%) for the
general scenario
(b) Minimum efficiency (%) for the
special scenario
(c) Average running time (s)
Figure 15. Performance in the general and special scenario with = 42, = 50.
According to the results in these figures, we find that the worst performance of the algorithm
is 61.91%, 49.26% for the simplified and the general scenario, respectively. The results agree
with the theoretical proof of minimum efficiency in Section 5.1.2. However, we can see that
the actual algorithm’s performance is much better than the one in theory. Indeed, the average
values of minimum efficiency among all test cases are 75.70% in the simplified scenario and
63.96% in the general scenario, much larger than
= 50% and


38.7% in theory.
Moreover, the standard deviations are 7.02 and 9.79, which are relatively small, meaning
that most of the achieved efficiencies are close to the average. Besides, when , our
algorithm produces the results, which are at least 78.42% as good as the optimal answer on
average, far beyond the theoretical ratio of (1 −
).
6.2.3. Running time
The figures presented in the previous section also show our algorithm’s average running
time on each case of (, ). As we mentioned in Section 5.1.3, the time complexity of our
greedy method is (
2
× 
3
). Therefore, when  grows more significant in our experiment,
51
the algorithm takes much longer time to produce an answer. As shown in the figures with
× = 42 × 50, we can see the maximum value among the average running time is just 0.708
seconds for the simplified scenario, and 40.125 seconds for the general scenario. In that
condition,  has the largest value. It is noticeable that a cell in a 42 × 50 grid is much smaller
than a circle of radius in real condition of the Hanoi bus system. It is not necessary to set
bigger values for and . Moreover, in a map with × = 42 × 50, the value 1000 of  is
already huge. Hence, the configuration of (, , ) as (1000, 42, 50) can be considered
sufficiently large for Hanoi’s bus system.
6.3. Numerical results of meta-heuristic algorithms
In the genetic algorithm, if the best found solution is not changed after 10 consecutive
generations then the process is stopped. In addition, if the average number of observable
critical squares of all solutions in the population does not increase after 10 consecutive
generations then we will stop the algorithm. The population size is set as 100×n. For the
simulated annealing algorithm, we set T_max = 100, T_dec = 0.1 and L = 200. The reason
for choosing these hyperparameters is to guarantee the efficiency, without consuming too
much running time. This is based on the fact that the bigger the hyperparameters are, the
more chance that better results can be found, when running meta-heuristics.
The tables below show the results for the meta-heuristics in the simplified scenario and the
general scenario, compared to the greedy approximation algorithm. The most important “eff.
chg.” columns show the percentage increase of the efficiency of the meta-heuristics
compared to the approximation algorithm. It is the subtraction between the minimum
efficiency (written as “min. eff.”) of GA (or SA) and the minimum efficiency of the
approximation algorithm. The average running time columns (written as “avg. time”) for the
meta-heuristics is similar to that of the approximation algorithm, since the way to calculate
the numbers in these columns is identical to that of the approximation algorithm.
Table 2. Meta-heuristics performance compared to the approximation algorithm’s results in
the simplified scenario.
p*q
c
r
Appro. algo
GA
SA
min.
eff.
(%)
avg.
time
(s)
min.
eff.
(%)
eff.
chg.
(%)
avg.
time
(s)
min.
eff.
(%)
eff.
chg.
(%)
avg.
time
(s)
10
10
0.5
75.00
0.003
75.00
0.00
0.999
75.00
0.00
3.763
52
x
12
1.0
80.00
0.001
90.00
+10.00
1.507
90.00
+10.00
5.009
2.0
100.00
0.001
100.00
0.00
0.641
100.00
0.00
6.539
20
0.5
84.21
0.004
89.47
+5.26
5.197
84.21
0.00
6.993
1.0
81.25
0.004
81.25
0.00
1.560
81.25
0.00
9.793
2.0
100.00
0.003
100.00
0.00
0.839
100.00
0.00
11.844
25
0.5
76.00
0.004
76.00
0.00
3.941
76.00
0.00
8.792
1.0
77.27
0.006
77.27
0.00
2.685
77.27
0.00
10.679
2.0
91.67
0.005
91.67
0.00
2.296
91.67
0.00
15.510
50
0.5
80.00
0.013
82.50
+2.50
9.565
80.00
0.00
16.416
25
x
30
10
2.0
87.50
0.002
100.00
+12.50
1.274
100.00
+12.50
4.696
25
1.0
61.91
0.007
61.91
0.00
3.435
61.91
0.00
6.751
2.0
64.00
0.004
68.00
+4.00
4.308
68.00
+4.00
9.919
3.0
81.25
0.007
87.50
+6.25
1.703
87.50
+6.25
9.447
50
1.0
61.91
0.013
66.67
+4.76
9.702
66.67
+4.76
12.846
2.0
77.78
0.013
80.00
+2.22
6.187
80.00
+2.22
16.087
3.0
83.67
0.010
83.67
0.00
6.632
83.67
0.00
20.442
100
1.0
72.62
0.032
73.81
+1.19
31.434
71.43
-1.19
25.020
2.0
76.74
0.032
81.40
+4.65
13.948
79.07
+2.33
32.726
3.0
78.26
0.037
83.70
+5.44
8.180
82.61
+4.35
43.995
200
1.0
76.07
0.087
77.91
+1.84
15.186
77.30
+1.23
51.574
53
2.0
73.08
0.106
75.82
+2.75
37.574
75.82
+2.75
69.895
3.0
75.82
0.129
78.02
+2.20
12.122
79.12
+3.30
89.346
300
1.0
75.49
0.204
77.82
+2.33
37.513
77.43
+1.95
78.471
2.0
75.27
0.256
78.18
+2.91
25.396
78.18
+2.91
112.056
3.0
71.94
0.342
75.18
+3.24
24.886
75.54
+3.60
132.718
400
1.0
72.57
0.348
71.39
-1.18
26.509
72.27
-0.30
109.202
2.0
73.78
0.503
79.73
+5.95
42.171
80.27
+6.49
156.562
30
x
36
25
1.0
66.67
0.007
72.22
+5.55
4.112
66.67
0.00
6.075
2.0
71.43
0.006
76.19
+4.76
4.247
76.19
+4.76
8.130
3.0
76.19
0.005
80.95
+4.76
3.287
80.95
+4.76
10.673
50
1.0
76.92
0.013
76.92
0.00
17.102
76.92
0.00
10.837
2.0
74.47
0.011
76.60
+2.13
15.597
76.60
+2.13
15.152
3.0
76.60
0.012
80.85
+4.26
3.532
80.85
+4.26
18.778
100
1.0
72.50
0.028
73.75
+1.25
8.655
71.25
-1.25
22.523
2.0
71.91
0.028
76.40
+4.49
10.247
76.40
+4.49
31.767
3.0
79.57
0.030
79.57
0.00
6.152
80.65
+1.08
37.397
200
1.0
75.17
0.086
74.50
-0.67
13.864
74.50
-0.67
44.190
2.0
71.01
0.080
70.41
-0.59
9.898
71.01
0.00
60.503
3.0
76.33
0.112
76.33
0.00
18.358
76.92
+0.59
77.446
400
1.0
73.31
0.277
74.23
+0.92
51.930
70.55
-2.76
91.772
54
2.0
69.23
0.331
74.56
+5.33
42.469
73.37
+4.14
122.261
3.0
75.43
0.597
80.06
+4.62
25.144
80.35
+4.91
168.479
42
x
50
25
1.0
73.68
0.007
78.95
+5.26
2.929
78.95
+5.26
5.844
2.0
77.27
0.006
77.27
0.00
8.286
77.27
0.00
7.110
3.0
77.27
0.006
81.82
+4.55
5.420
81.82
+4.55
9.220
50
1.0
73.53
0.017
79.41
+5.88
8.193
76.47
+2.94
9.691
2.0
71.74
0.013
80.44
+8.70
6.891
78.26
+6.52
13.649
3.0
75.00
0.012
77.50
+2.50
6.744
77.50
+2.50
14.958
100
1.0
73.13
0.035
73.13
0.00
11.013
70.15
-2.99
18.700
2.0
74.70
0.029
75.90
+1.21
9.790
73.49
-1.21
24.223
3.0
73.56
0.030
77.01
+3.45
22.047
75.86
+2.30
29.031
200
1.0
66.67
0.083
71.07
+4.40
38.095
67.93
+1.26
38.137
2.0
73.33
0.088
72.78
-0.55
15.727
72.22
-1.11
52.572
3.0
73.05
0.094
74.85
+1.80
49.978
72.46
-0.60
61.153
500
1.0
72.49
0.429
73.52
+1.03
57.030
74.04
+1.54
102.071
2.0
75.82
0.574
77.00
+1.17
199.082
74.65
-1.17
124.053
3.0
67.79
0.708
68.90
+1.12
71.113
66.89
-0.89
158.714
Avg. min. efficiency change (%)
+2.69 +1.94
55
Table 3. Meta-heuristics performance compared to the approximation algorithm’s results in
the general scenario.
p*q
c
r
Appro. algo
GA
SA
min.
eff.
(%)
avg.
time
(s)
min.
eff.
(%)
eff.
chg.
(%)
avg.
time
(s)
min.
eff.
(%)
eff.
chg.
(%)
avg.
time (s)
10
x
12
10
0.5
70.00
0.003
70.00
0.00
0.820
70.00
0.00
13.264
1.0
75.00
0.003
75.00
0.00
0.863
75.00
0.00
12.190
2.0
88.89
0.004
88.89
0.00
0.931
88.89
0.00
18.646
20
0.5
66.67
0.007
72.22
+5.55
2.901
72.22
+5.55
17.562
1.0
61.11
0.008
66.67
+5.56
1.185
61.11
0.00
28.789
2.0
78.95
0.011
78.95
0.00
1.112
78.95
0.00
42.148
25
0.5
66.67
0.009
66.67
0.00
1.294
66.67
0.00
27.154
1.0
68.18
0.011
68.18
0.00
2.510
68.18
0.00
31.628
2.0
72.00
0.015
72.00
0.00
1.407
72.00
0.00
56.079
30
0.5
67.86
0.013
67.86
0.00
2.724
67.86
0.00
28.030
1.0
62.96
0.014
62.96
0.00
1.582
62.96
0.00
42.721
2.0
70.00
0.019
73.33
+3.33
1.606
70.00
0.00
62.367
40
0.5
66.67
0.020
66.67
0.00
3.918
66.67
0.00
36.329
1.0
62.50
0.023
68.75
+6.25
2.248
62.50
0.00
63.075
2.0
70.00
0.035
72.50
+2.50
2.223
70.00
0.00
87.449
50
0.5
61.29
0.027
64.52
+3.23
1.754
61.29
0.00
53.502
56
1.0
63.16
0.031
63.16
0.00
1.848
63.16
0.00
69.516
2.0
74.00
0.059
78.00
+4.00
2.532
74.00
0.00
129.199
25
x
30
10
0.5
77.78
0.003
88.89
+11.11
2.845
88.89
+11.11
7.352
1.0
66.67
0.004
66.67
0.00
2.305
66.67
0.00
6.575
2.0
75.00
0.003
75.00
0.00
0.629
75.00
0.00
8.494
3.0
80.00
0.003
80.00
0.00
1.866
80.00
0.00
12.423
25
0.5
75.00
0.011
75.00
0.00
8.517
75.00
0.00
13.634
1.0
75.00
0.010
80.00
+5.00
7.825
80.00
+5.00
15.842
2.0
66.67
0.009
71.43
+4.76
4.053
71.43
+4.76
21.026
3.0
65.22
0.012
65.22
0.00
1.356
65.22
0.00
33.002
50
0.5
65.71
0.027
71.43
+5.72
4.281
71.43
+5.72
24.668
1.0
72.73
0.023
72.73
0.00
4.653
72.73
0.00
29.608
2.0
68.89
0.025
71.11
+2.22
5.944
68.89
0.00
44.425
3.0
61.91
0.029
61.91
0.00
1.837
61.91
0.00
71.817
100
0.5
62.65
0.072
67.47
+4.82
10.507
63.86
+1.20
49.002
1.0
58.21
0.068
58.21
0.00
3.677
58.21
0.00
59.469
2.0
62.64
0.097
61.54
-1.10
6.099
65.93
+3.30
93.112
3.0
62.50
0.123
62.50
0.00
3.280
62.50
0.00
137.105
200
0.5
60.82
0.239
64.33
+3.51
17.677
64.33
+3.51
104.522
1.0
53.10
0.285
53.10
0.00
9.853
53.10
0.00
151.528
57
2.0
57.38
0.387
58.47
+1.09
8.202
58.47
+1.09
214.758
3.0
62.91
0.746
62.25
-0.66
6.543
62.91
0.00
347.961
300
0.5
50.89
0.440
51.79
+0.89
11.955
51.79
+0.89
174.918
1.0
54.80
0.661
53.43
-1.37
15.331
54.80
0.00
249.174
2.0
55.99
1.272
58.45
+2.47
16.381
57.75
+1.76
351.679
3.0
62.89
2.565
62.89
0.00
5.235
62.89
0.00
797.241
400
0.5
50.34
0.858
50.34
0.00
18.649
52.04
+1.70
251.393
1.0
53.40
1.349
52.72
-0.68
17.310
53.40
0.00
391.696
2.0
56.76
3.080
57.57
+0.81
33.678
57.84
+1.08
573.461
3.0
63.07
6.886
64.46
+1.39
23.599
63.07
0.00
1389.834
30
x
36
10
0.5
77.78
0.003
77.78
0.00
4.151
77.78
0.00
6.874
1.0
77.78
0.003
88.89
+11.11
4.908
88.89
+11.11
7.201
2.0
75.00
0.003
75.00
0.00
0.755
75.00
0.00
10.635
3.0
83.33
0.003
83.33
0.00
0.826
83.33
0.00
11.361
25
0.5
84.21
0.010
84.21
0.00
6.806
84.21
0.00
12.698
1.0
60.00
0.010
60.00
0.00
5.489
60.00
0.00
14.324
2.0
76.00
0.009
80.00
+4.00
5.559
76.00
0.00
19.747
3.0
70.83
0.009
70.83
0.00
4.603
70.83
0.00
23.893
50
0.5
66.67
0.027
66.67
0.00
6.502
64.10
-2.56
23.598
1.0
62.86
0.025
62.86
0.00
3.180
62.86
0.00
31.891
58
2.0
58.54
0.025
68.29
+9.76
4.499
68.29
+9.76
37.372
3.0
62.16
0.029
62.16
0.00
2.384
62.16
0.00
58.677
100
0.5
69.44
0.066
70.83
+1.39
9.913
70.83
+1.39
42.094
1.0
63.64
0.067
67.53
+3.90
35.936
66.23
+2.60
46.238
2.0
55.56
0.082
55.56
0.00
6.782
56.67
+1.11
78.931
3.0
62.69
0.104
62.69
0.00
6.084
62.69
0.00
131.681
200
0.5
66.91
0.205
66.91
0.00
16.704
66.91
0.00
87.527
1.0
54.17
0.250
54.17
0.00
12.160
54.17
0.00
115.716
2.0
54.95
0.354
56.04
+1.10
10.100
56.04
+1.10
189.385
3.0
62.90
0.529
62.90
0.00
7.966
62.90
0.00
287.857
300
0.5
60.77
0.442
60.29
-0.48
20.091
60.77
0.00
143.786
1.0
52.34
0.568
55.32
+2.98
19.337
55.32
+2.98
187.807
2.0
55.20
1.095
53.76
-1.43
17.690
55.56
+0.36
325.873
3.0
61.81
1.969
66.32
+4.51
21.368
65.28
+3.47
431.393
400
0.5
52.30
0.769
51.97
-0.33
21.101
52.63
+0.33
197.473
1.0
50.14
1.070
50.14
0.00
25.388
50.14
0.00
282.635
2.0
52.98
1.993
53.26
+0.28
20.407
53.26
+0.28
479.796
3.0
62.96
4.006
62.96
0.00
20.036
62.96
0.00
1039.680
500
0.5
51.50
1.140
51.50
0.00
28.696
51.50
0.00
283.167
1.0
53.72
1.751
54.65
+0.93
51.960
54.19
+0.47
370.321
59
2.0
53.44
3.777
54.32
+0.89
60.282
54.10
+0.66
622.763
3.0
60.55
10.806
61.83
+1.28
42.663
61.19
+0.64
754.858
42
x
50
10
0.5
100.00
0.003
100.00
0.00
0.602
100.00
0.00
4.324
1.0
88.89
0.003
88.89
0.00
3.451
88.89
0.00
6.030
2.0
70.00
0.003
70.00
0.00
1.564
70.00
0.00
8.257
3.0
62.50
0.003
62.50
0.00
1.355
62.50
0.00
8.946
25
0.5
83.33
0.011
83.33
0.00
1.349
83.33
0.00
11.212
1.0
77.78
0.010
77.78
0.00
3.878
77.78
0.00
12.074
2.0
79.17
0.010
79.17
0.00
8.963
75.00
-4.17
15.365
3.0
63.64
0.010
72.73
+9.09
1.077
63.64
0.00
19.225
50
0.5
65.63
0.027
65.63
0.00
6.454
65.63
0.00
19.384
1.0
74.36
0.028
79.49
+5.13
8.944
76.92
+2.56
21.638
2.0
62.50
0.026
65.00
+2.50
4.026
65.00
+2.50
30.237
3.0
59.09
0.027
63.64
+4.55
7.275
63.64
+4.55
36.256
100
0.5
70.31
0.071
73.44
+3.13
13.921
70.31
0.00
35.943
1.0
65.82
0.066
67.09
+1.27
8.162
67.09
+1.27
42.741
2.0
59.14
0.078
62.37
+3.23
6.193
62.37
+3.23
62.031
3.0
56.32
0.086
56.32
0.00
3.733
56.32
0.00
81.768
200
0.5
69.57
0.200
72.46
+2.90
32.199
71.74
+2.17
69.035
1.0
61.33
0.198
61.33
0.00
13.926
61.33
0.00
89.575
60
2.0
54.48
0.256
53.79
-0.69
6.271
54.48
0.00
140.323
3.0
53.89
0.340
57.78
+3.89
11.633
57.78
+3.89
179.273
300
0.5
62.30
0.391
63.39
+1.09
25.115
62.30
0.00
106.710
1.0
54.74
0.426
55.26
+0.53
12.106
55.26
+0.53
145.949
2.0
55.35
0.640
52.03
-3.32
20.631
55.35
0.00
201.175
3.0
51.64
0.842
54.55
+2.91
13.443
54.55
+2.91
295.075
400
0.5
56.03
0.584
54.86
-1.17
19.659
56.03
0.00
158.071
1.0
53.46
0.664
53.46
0.00
16.624
53.46
0.00
214.609
2.0
54.93
1.128
54.93
0.00
17.387
54.93
0.00
343.229
3.0
51.37
2.096
51.10
-0.28
25.342
52.75
+1.37
438.352
500
0.5
62.58
0.999
61.98
-0.60
34.305
62.58
0.00
196.484
1.0
55.56
1.196
55.56
0.00
30.234
55.56
0.00
252.401
2.0
57.44
2.148
57.44
0.00
22.872
57.44
0.00
494.512
3.0
50.97
4.239
51.18
+0.21
32.244
52.26
+1.29
636.920
1000
0.5
49.26
4.447
49.26
0.00
60.543
49.26
0.00
534.257
1.0
52.17
7.101
52.32
+0.15
72.606
52.32
+0.15
801.631
2.0
54.83
16.733
56.05
+1.22
191.155
55.49
+0.67
1063.719
3.0
53.51
40.125
53.51
0.00
64.266
53.51
0.00
2192.211
Avg. min. efficiency change (%)
+1.28 +0.91
61
6.4. Comparison of results between the approximation
algorithm and the meta-heuristics
Based on the tables in section 6.3, we can see that meta-heuristics outperformed the
approximation algorithm by just a small margin (from 1% to 3% on average), though
consuming much more running time.
On average, GA helps increase the minimum efficiency by 2.69% in the simplified case and
1.28% in the general case, while SA helps increase the minimum efficiency by 1.94% in the
simplified case and 0.91% in the general case. In spite of that, there is still a considerable
number of times where meta-heuristics are better, which is quite promising. Out of all 172
test cases including simplified and general scenarios, there are 86 times (50% of time) that
GA is better than the approximation algorithm, and for SA there are 72 times (41.86% of
time). There are only 16 times (9.3% of time) that GA is worse than the approximation
algorithm, and for SA there are only 13 times (7.56% of time). These numbers mean that
meta-heuristics are usually a better choice for efficiency. GA is slightly better than SA in
this problem, however the difference is pretty small, and there are still cases where SA
outperformed GA.
To be practical, an algorithm should run fast enough. We see that meta-heuristics improves
the quality of the results, but their running time is a big problem. Compared to the
approximation approach, GA and SA are usually hundred or even thousand times slower,
but does not guarantee a better solution. In these meta-heuristics, the bigger the
hyperparameters, the more chance that a good result will be found. However making those
hyperparameters bigger than the current setup is not a suitable act since the current running
time of our meta-heuristics are already huge, and it would take days or months to try a new
set of hyperparameters, which is infeasible.
All of these observations suggest that our approximation algorithm’s results were not far
from optimal. However, we should still consider the meta-heuristics approaches in real life
if time budget allows.
6.5. Discussion
This research is in line with the implementation of the mobile air quality monitoring system
in Hanoi, as presented in [24]. With the system, the numerical results obtained from this
study can be applied in our real system as follows. As the Hanoi bus map has the size of 20
km × 30 km, and the sensing range of the used sensor is about 120 m, our network topology
corresponds to the test case of = 42, = 50, and = 0.5. Figs. 10, 14 depict the minimum
efficiency and running time obtained in this test case. As shown, for all combinations of (,
62
) (i.e., the number of sensors and the number of turn-on times of each sensor), our proposed
solution achieved the minimum efficiency from 49.26% to 100% when the number of critical
regions varies from 10 to 1000. Moreover, when the number of critical regions is 1000 and
the sensors are sufficiently powered by the bus to be turned on all the time during the bus
routes, the number of sensors required to cover the maximum number of critical regions is
38. When the number of sensors is reduced to 16, we achieve 90% of the maximal coverage.
We also deduce that to save about 95 percent of sensors’ energy consumption, we need about
40 sensors.
This study presented a theoretical approach to address the opportunistic sensing in mobile
air quality monitoring systems. We have made some ideal assumptions, including the disk-
based sensing paradigm and the unchanging nature of the critical regions. In practice, the
sensing model is more complicated, and the measurement accuracy may be affected by many
factors such as temperature, humidity [25]. Besides, the critical regions can also change over
time. For such additional problems, we need to leverage other techniques (e.g., machine
learning [26, 27]) that can capture the network’s dynamic nature and forecast its future state;
thereby, allowing for decision making that is adaptable to the network’s variety. These
problems will be addressed in our future work.
63
This thesis investigated the opportunistic sensing optimization (OSO) problem in the mobile
air quality monitoring system. First, we have mathematically formulated the OSO problem
and proved its NP-hardness. Second, we proposed a polynomial-time


and
approximation algorithms, which are for OSO in the general scenario and the simplified
one, respectively. Finally, we proposed two meta-heuristics approaches: genetic algorithm
and simulated annealing, to further test the efficiency of the greedy approximation algorithm
and optimize the results.
We have extensively evaluated the approximation algorithms on the real bus map, similar to
the one in Hanoi, Vietnam. The evaluation results showed that the proposed algorithm’s
average performance ratio is 63.96% for the general scenario and 75.70% for the simplified
one. Moreover, the maximum value of average running time is 40.125 seconds among all
test cases. Especially, if the energy supply is sufficient for the sensor to be turned on for the
whole route (i.e., the special case where ), the approximation ratio will become (1 −
),
which is almost twice as much as


, and also the average performance ratio becomes
78.42%.
For the meta-heuristic methods, experiments show that these algorithms only increase the
goodness of the results by 1% to 3% on average, but have a larger running time than the
greedy algorithm. From there, we see that the approximation algorithm in particular is
already a feasible solution in practice without mentioning any other complicated tools.
However, when time budget allows, meta-heuristics should be considered since they usually
produce better solutions.
Chapter 7. Conclusion
64
[1] Nguyen, V. D., Le Nguyen, P., Nguyen, K., & Do, P. T. (2022). Constant
approximation for opportunistic sensing in mobile air quality monitoring system.
Computer Networks, 202, 108646.
[2] V.D. Nguyen, P. Le Nguyen, T.H. Nguyen, K. Nguyen, P.T. Do, An


-
Approximation Algorithm for Maximizing Coverage Capability in Mobile Air
Quality Monitoring Systems, in: Proc. IEEE NCA, 2020, pp. 1–4.
[3] V.D. Nguyen, P. Le Nguyen, T.H. Nguyen, P.T. Do, A
-Approximation Algorithm
for Target Coverage Problem in Mobile Air Quality Monitoring Systems, in: Proc.
IEEE GLOBECOM, 2020, pp. 1–6.
Published papers
65
[1] Ambient air pollution: A global assessment of exposure and burden of disease, 2021,
(Access date: 2021 June), https://www.who.int/phe/publications/air-pollution-global-
assessment/en/.
[2] A. Lozano, J. Usero, E. Vanderlinden, J. Raez, J. Contreras, B. Navarrete, Air quality
monitoring network design to control nitrogen dioxide and ozone, applied in Malaga,
Spain, Microchem. J. 93 (2) (2009) 164–172.
[3] O. Postolache, M. Pereira, P. Girao, Smart Sensor Network for Air Quality
Monitoring Applications, in: Proc. IEEE Instrumentation and Measurement
Technology Conference, 2005, pp. 537–542.
[4] J.-H. Liu, Y.-F. Chen, T.-S. Lin, D.-W. Lai, T.-H. Wen, C.-H. Sun, J.-Y. Juang,
J.Jiang, Developed urban air quality monitoring system based on wireless sensor
networks, in: Proc. IEEE ICST, 2011, pp. 549–554.
[5] K. Zheng, S. Zhao, Z. Yang, X. Xiong, W. Xiang, Design and implementation of
LPWA-based air quality monitoring system, IEEE Access 4 (2016) 3238–3245.
[6] R. Yasmin, J. Petäjäjärvi, K. Mikhaylov, A. Pouttu, Large and Dense LoRaWAN
Deployment to Monitor Real Estate Conditions and Utilization Rate, in: Proc. IEEE
PIMRC, 2018, pp. 1–6.
[7] Air visual, 2021, (Access date: 2021 June), https://www.airvisual.com/vietnam/hanoi
[8] T. Villa, F. Gonzalez, B. Miljievic, Z. Ristovski, Morawska, An overview of small
unmanned aerial vehicles for air quality measurements: Present applications and
future prospectives, Sensors 16 (2016).
[9] Y. Yang, Z. Bai, Z. Hu, Z. Zheng, K. Bian, L. Song, AQNet: Fine-grained 3D spatio-
temporal air quality monitoring by aerial-ground WSN, in: Proc. IEEE INFOCOM
WKSHPS, 2018, pp. 1–2.
[10] S. Kaivonen, E.C.-H. Ngai, Real-time air pollution monitoring with sensors on city
bus, Digit. Commun. Netw. 6 (1) (2020) 23–30.
References
66
[11] S.M. Biondi, V. Catania, S. Monteleone, C. Polito, Bus as a sensor: A mobile sensor
nodes network for the air quality monitoring, in: Proc. IEEE WiMob, 2017, pp. 272–
277.
[12] X. Cheng, D.-Z. Du, L. Wang, B. Xu, Relay sensor placement in wireless sensor
networks, Wirel. Netw. 14 (3) (2008) 347–355.
[13] F. Senel, M. Younis, Relay node placement in structurally damaged wireless sensor
networks via triangular steiner tree approximation, Comput. Commun. 34 (16)
(2011) 1932–1941.
[14] A. Shan, X. Xu, Z. Cheng, Target coverage in wireless sensor networks with
probabilistic sensors, Sensors 16 (9) (2016).
[15] X. Zhu, J. Li, M. Zhou, Target coverage-oriented deployment of rechargeable
directional sensor networks with a mobile charger, IEEE Internet Things J. 6 (3)
(2019) 5196–5208.
[16] D. Arivudainambi, S. Balaji, T.S. Poorani, Sensor deployment for target coverage in
underwater wireless sensor network, in: Proc. International Conference on
Performance Evaluation and Modeling in Wired and Wireless Networks (PEMWN),
2017, pp. 1–6.
[17] N.T. Hanh, H.T.T. Binh, N. Van Son, P.N. Lan, Minimal Node Placement for
Ensuring Target Coverage With Network Connectivity and Fault Tolerance
Constraints in Wireless Sensor Networks, in: Proc. IEEE CEC, 2019, pp. 2923–2930.
[18] X. Gao, Z. Chen, F. Wu, G. Chen, Energy efficient algorithms for -sink minimum
movement target coverage problem in mobile sensor network, IEEE/ACM Trans.
Netw. 25 (6) (2017) 3616–3627.
[19] N.T. Nguyen, B. Liu, S. Wang, On new approaches of maximum weighted target
coverage and sensor connectivity: Hardness and approximation, IEEE Trans. Netw.
Sci. Eng. (2019) 1.
[20] M. Rout, R. Roy, Self-deployment of mobile sensors to achieve target coverage in
the presence of obstacles, IEEE Sens. J. 16 (14) (2016) 5837–5842.
[21] R. Choudhuri, R.K. Das, Coverage of targets in mobile sensor networks with
restricted mobility, IEEE Access 6 (2018) 10803–10813.
67
[22] D.S. Hochbaum, Approximating covering and packing problems: set cover, vertex
cover, independent set, and related problems, in: Approximation Algorithms for NP-
Hard Problems, PWS Publishing Company, 1996, pp. 94–143.
[23] C. Chekuri, A. Kumar, Maximum coverage problem with group budget constraints
and applications, in: Approximation, Randomization, & Combinatorial Optimization:
Algorithms & Techniques, 2004, pp. 72–83.
[24] V.A. Nguyen, V.H. Vu, V.S. Doan, T.H. Nguyen, P.T. Do, K. Nguyen, P.L. Nguyen,
M.T. Le, Realizing mobile air quality monitoring system: Architectural concept and
device prototype, in: Asia Pacific Conference on Communications (APCC), 2021.
[25] M. Balanescu, I. Oprea, G. Suciu, M.-A. Dobrea, C. Balaceanu, R.-I. Ciobanu, C.
Dobre, A study on data accuracy for IoT measurements of PMs concentration, in:
2019 22nd International Conference on Control Systems and Computer Science
(CSCS), 2019, pp. 182–187.
[26] M. Dobrea, A. Bădicu, M. Barbu, O. Subea, M. Bălănescu, G. Suciu, A. Bîrdici, O.
Orza, C. Dobre, Machine learning algorithms for air pollutants forecasting, in: 2020
IEEE 26th International Symposium for Design and Technology in Electronic
Packaging (SIITME), 2020, pp. 109–113.
[27] M.H. Nguyen, P. Le Nguyen, K. Nguyen, V.A. Le, T.-H. Nguyen, Y. Ji, PM2.5
prediction using genetic algorithm-based feature selection and encoder-decoder
model, IEEE Access 9 (2021) 57338–57350.
[28] Michalewicz Z, Schoenauer M (1996), Evolutionary algorithms for constrained
parameter optimization problems. Evol Comput 4(1):1–32
[29] Holland JH (1975), Adaptation in natural and artificial systems. The U. of Michigan
Press
[30] Scott Kirkpatrick, C Daniel Gelatt, and Mario P Vecchi. Optimization by simulated
annealing. science, 220(4598):671–680, 1983.
https://doi.org/10.1126/science.220.4598.671.
[31] Vladimír Černỳ. Thermodynamical approach to the traveling salesman problem: An
efficient simulation algorithm. Journal of optimization theory and applications,
45(1):41–51, 1985. https://doi.org/10.1007/bf00940812.
[32] El-Ghazali Talbi. Metaheuristics: from design to implementation, volume 74. John
Wiley & Sons, 2009. https://doi.org/10.1002/9780470496916.
CỘNG HÒA XÃ HỘI CHỦ NGHĨA VIỆT NAM
Độc lập Tự do Hạnh phúc
BẢN XÁC NHẬN CHỈNH SỬA LUẬN VĂN THẠC
Họ và tên tác giả luận văn: Nguyễn Việt Dũng
Đề tài luận văn: Triển khai tối ưu các hệ thống quan trắc không khí di động
thông minh
Chuyên ngành: Khoa học dữ liệu và Trí tuệ nhân tạo
Mã số SV: 20202342M
Tác giả, Người hướng dẫn khoa học và Hội đồng chấm luận văn xác nhận tác giả đã
sửa chữa, bổ sung luận văn theo biên bản họp Hội đồng ngày 29/10/2022 với các nội
dung sau:
- Thêm giới thiệu chi tiết hơn về các nghiên cứu có liên quan trong chương 2.
- Đổi tên chương 3 t “Problem formulation & hardness” thành “Problem
formulation”.
- Thêm phát biểu về bài toán opportunistic sensing optimization trước khi viết tắt
thành OSO.
- Đổi tên phần 3.2 thành “Mathematical formulation of OSO”.
- Thêm giải thích rõ hơn về hàm mục tiêu và các điều kiện trong mục 3.2.
- Thêm lý do giải thích vì sao sử dụng thuật toán quy hoạch động:
“In this simplified scenario, our dynamic programming approach guarantees that
the set found by the submaxSet function is always maximum. thus the number
mentioned in the previous section 5.1.1.2 will be equal to 1. Later we will show
that we cannot use dynamic programming in the general scenario, and we will
need another greedy sub-process which has a lower performance ratio for that.”
- Thêm một số giải thích chi tiết về các thuật toán meta-heuristics lý do lựa chọn
sử dụng chúng, cụ thể như sau:
+ They are appropriate methods to verify efficiency of the approximation
algorithm, since their tremendous performance in practice was shown in
numerous research papers, especially researches related to air monitoring
systems. If the greedy approximation approach is decent, the experimental results
produced by it should be competitive to the ones produced by the chosen meta-
heuristics. It is indeed true, and we will show the experimental results supporting
this observation later in this thesis.
+ “Two meta-heuristics, the genetic algorithm and the simulated annealing
algorithm, are chosen to solve the OSO problem because of their simplicity and
efficiency in practice. Related researches about air monitoring systems also
deployed these methods to solve challenging problems, and the results usually
show that they are good choices for creating a solution.”
- Thêm giải thích cho các hình vẽ và bảng biểu.
- Thêm mô tả input và output cho các thuật toán.
- Thêm mục 6.4. “Comparison of results between the approximation algorithm and
the meta-heuristics” chuyển mục 6.4 cũ thành mục 6.5. “Discussion”.
Ngày tháng năm
Giáo viên hướng dẫn Tác giả luận văn
CHỦ TỊCH HỘI ĐỒNG